Screening and engineering method of super-stable immunoglobulin variable domains and their uses

ABSTRACT

There are provided a method named Tat-associated protein engineering (TAPE) for screening a target protein having higher solubility and excellent thermostability, in particular an immunoglobulin variable domain (VH or VL) derived from the human germ-line, by preparing a gene construct in which the target protein and an antibiotic-resistance protein are linked to a Tat signal sequence and then expressing the construct in E. coli; and human or engineered VH and VL domain antibodies and human or engineered VH and VL domain antibody scaffolds having solubility and excellent thermostability, which are screened by the TAPE method. There are also provided a library including random CDR sequences in the human or engineered VH or VL domain antibody scaffold screened by the TAPE method, and a method of preparing the library. There are also provided a VH or VL domain antibody having binding ability to a target protein, screened by using the library, and a pharmaceutical composition including the domain antibody.

CROSS-REFERENCE TO RELATED APPLICATIONS

This application is the U.S. National Phase under 35 U.S.C. §371 of International Application PCT/KR2012/006680, filed Aug. 22, 2012.

REFERENCE TO SEQUENCE LISTING

This application incorporates by reference the sequence listing submitted as ASCII text filed via EFS-Web on Feb. 20, 2015. The Sequence Listing is provided as a file entitled “5TIP009_001APC_ST25.txt,” created on Feb. 18, 2015, and which is approximately 127 kilobytes in size.

TECHNICAL FIELD OF THE INVENTION

The following disclosure relates to a method named Tat-associated protein engineering (TAPE) for screening a target protein having higher solubility and excellent thermostability, in particular an immunoglobulin variable domain (VH or VL) derived from the human germ-line, by fusing the target protein and an antibiotic-resistance protein to a Tat signal sequence and expressing the fusion in E. coli. This invention also relates to human heavy chain variable domain antibodies (hereinafter “VH domain antibody”) and light chain variable domain antibodies (hereinafter “VL domain antibody”), and to human or engineered VH and VL domain antibody scaffolds having excellent solubility and thermostability, which are screened by the TAPE method. The following disclosure also relates to amino acid sequences of the VH and VL domain antibodies and the antibody scaffolds, and to polynucleotides encoding the amino acid sequences.

A VH or VL domain antibody that contains a human or engineered VH or VL scaffold screened according to the present invention retains its solubility and thermostability regardless of its CDR sequences, as long as it has the corresponding human or engineered VH or VL domain antibody scaffold.

Further, the following disclosure relates to a library including random CDR sequences in the human or engineered VH or VL domain antibody scaffold screened by the TAPE method, and a method of preparing the library.

Further, the following disclosure relates to a VH or VL domain antibody having binding ability toward a target protein, screened by using the library, an amino acid sequence of the domain antibody, and a polynucleotide encoding the amino acid sequence.

BACKGROUND OF THE INVENTION

A fragmented small-size antibody is a promising format capable of overcoming the limitations of existing antibody therapeutics because its physicochemical properties differ from those of a full-length monoclonal antibody (mAb). General antibody fragments include a single chain antibody (scFv), a Fab (fragment antigen-binding) antibody, and an immunoglobulin variable domain antibody such as VH or VL; their modified forms include a tandem scFv, a diabody, a minibody, and the like (Better et al., Science 1988 240(4855):1041-3; Huston et al., Proc. Natl. Acad. Sci. USA. 1988 85(16):5879-83; Bird et al., Science 1988 242(4877):423-6; Pei et al., Proc. Natl. Acad. Sci. USA. 1997 94(18):9637-42; Iliades et al., FEBS Lett. 1997 409(3):437-41; Ward et al., Nature 1989 341(6242):544-6).

An antibody fragment or a small-size antibody generally lacks the functions of Fc (crystallizable fragment) found in the full-length monoclonal antibody, and thus the effects that depend on the existence of Fc, such as an increased circulating half-life and an effector function, cannot be expected.

However, the small-size fragmented antibody is drawing attention as a next-generation antibody capable of overcoming limitations that result from the large size of the existing whole antibody, such as limited accessibility to structurally hidden epitopes, drug penetration and biodistribution, format flexibility, and high production costs (Zhao et al., Blood 2007 110(7):2569-77; Holliger et al., Nat. Biotechnol. 2005 23(9):1126-36; Hudson et al., Med. Microbiol. Immunol. 2009 198(3):157-74; Enever et al., Curr. Opin. Biotechnol. 2009 20(4):405-11).

Furthermore, various kinds of antibody fragments and small-size antibodies have an advantage in that dual or multiple specificity can be realized by connecting them through a chemical method or a recombinant protein fusion method.

Recently, this advantage has been utilized to introduce multiple specificity such that the Fc function is modularized, thereby supplementing the effector function and the short circulating half-life, which have been pointed out as disadvantages. For example, a fragment or a small-size antibody having binding specificity toward human serum albumin is introduced as a module to realize a dual-specificity antibody, thereby increasing the circulating half-life, or a small-size antibody having specific affinity to an immune cell, such as a natural killer cell or a T cell, is introduced as a module, thereby conferring a cell-killing function (Els et al., J Biol Chem. 2001 9; 276(10):7346-50; Bargou et al., Science 2008 321(5891):974-7; et al., Mol. Cancer Ther. 2008 7(8):2288-97). Also, since a single antibody can be designed to have specificity to two or more molecular targets with different expected modes of action, this opens the possibility of significantly improving the efficacy and economic feasibility of the antibody.

The smallest unit of human antibody structure that has an antigen-specific binding function is a heavy chain variable domain (VH) or a light chain variable domain (VL), which is the variable domain positioned at the N-terminus of a heavy chain or a light chain. Since the two N-terminal domains have evolved to have complementary structures, VH and VL form a non-covalently bound complex when the heavy chain and the light chain are assembled during production of a monoclonal antibody in a plasma B cell, and thereby maintain structural stability. Human antibody variable domain VH segments are classified into 7 families (VH1, VH2, VH3, VH4, VH5, VH6, VH7) depending on the homology of amino acid sequences in the frame portion, excluding the CDRs (complementarity determining regions) that bind to the epitope, and each family contains three to twenty-two distinct amino acid sequences. VLs of the light chain are divided into V kappa and V lambda; V kappa is classified into six families and V lambda into ten families (Chothia et al., 1992 J. Mol. Biol. 227, 799-917; Tomlinson et al., 1995 EMBO J. 14, 4628-4638; Williams et al., J. Mol. Biol. 264, 220-232). It is known that a number of VH and VL have preferred VH/VL pairing combinations depending on the degree of mutual affinity, and that this combinatorial rearrangement of genes plays an important role in enlarging the diversity of the antibody repertoire (Ruud et al., J. Mol. Biol. 1999, 285, 895-901).

A paired VH/VL confers complementary binding specificity to a particular antigen through combinations of 6 CDRs: CDR1, CDR2, and CDR3 of the light chain and CDR1, CDR2, and CDR3 of the heavy chain all participate in binding to the antigen. Analyses of human germ-line sequences found that the diversity of the respective CDRs mostly depends on CDR3 of the heavy chain variable domain. This implies that the binding specificity of an antibody mostly depends on the variability of the heavy chain CDR3 (J. Mol. Recogni. 2000, 13, 167-187).

In contrast, animals such as camels and llamas, and cartilaginous fish such as sharks, have antibodies of a single heavy chain structure without a light chain. The variable domain of these antibodies therefore consists of only a single heavy chain variable domain (VHH and VNAR for camel and shark, respectively), and it is known that such an antibody is no less competent in antigen binding and neutralizing function than the human antibody, where VH and VL simultaneously participate in binding to antigens. In humans, VH or VL alone is rarely present, except in patients having heavy chain diseases (Hendershot et al., J. Cell. Biol. 1987 104(3):761-7; Prelli et al., J. Immunol. 1992 148 (3): 949-52). The reason is that, because of their structural complementarity, VH or VL is not structurally stable when separated, and thus protein aggregation may easily occur. It is known that this protein aggregation partially results from hydrophobic interaction caused by hydrophobic amino acid residues distributed mainly at the interface of VH and VL. In the case of the camel antibody, unlike the human antibody, hydrophilic amino acid residues are specifically positioned on the surface of the VH/VL border region. In particular, the amino acids at four sites of the camel antibody that specifically differ from those of the human VH3 family are called a tetrad. These amino acids are positioned at 37, 44, 45, and 47 in the Kabat numbering system (Kabat et al., 1991 J. Immunol. 147(5), 1709-1719). This difference in amino acid sequence may explain the stability of the single variable domain antibody (VHH). There was an attempt to produce improved, camelized antibodies by replacing the amino acids at tetrad positions in the human variable domain antibody with the hydrophilic amino acids of the camel antibody (G44E/L45R/W47G).

As a result, solubility was somewhat improved in terms of physical and chemical properties (Coppieters et al., Arthritis Rheum. 2006 54(6):1856-66; Dolk et al., Proteins. 2005 59(3):555-64; Ewert et al., Biochem. 2002 41(11):3628-36; Kortt et al., J. Protein Chem. 1995 14(3):167-78; Martin et al., Protein Eng. 1997 10(5):607-14). However, stability comparable to that of the camel antibody was difficult to obtain; for example, protein expression yield and thermostability decreased (Davies et al., FEBS Lett. 1994 Feb. 21; 339(3):285-90; Aires et al., J. Mol. Biol. 2004 340(3):525-42). It was found that the reason was that modification of amino acids at the VH/VL border region changes the beta-sheet structure of the corresponding region (Riechmann et al., J. Mol. Biol. 1996 259(5):957-69). The CDR3 of the camel single domain antibody has an unusually long loop structure as compared with the human antibody. Structural analysis found that this loop folds over the region corresponding to the VH/VL border region of the human antibody, and it has been suggested that this distinct structure partially shields a hydrophobic patch positioned at the border region, thereby helping to stabilize the camel single domain antibody (Joost et al., 2010 Drug Discovery Today: Technologies 7(2), 139-146).

This shielding effect can hardly be expected in the human antibody because of the relatively short loop structure of its CDR3. In conclusion, the human single variable domain itself has inferior physical and chemical properties as compared with the camel single domain antibody, and thus is not sufficient to be utilized as a scaffold of a binding ligand to a particular antigen. To overcome this, mere replacement of the tetrad amino acids, which are structural signatures of the camel antibody, is not sufficient, and protein structure design and directed evolution of VH or VL are further needed.
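The tetrad substitutions quoted above (G44E/L45R/W47G) can be illustrated with a minimal Python sketch. It is not part of the disclosed method. The function name apply_substitutions, the dictionary layout keyed by Kabat position, and the residue assumed at position 37 (V) are assumptions made only for illustration; the residues at positions 44, 45, and 47 follow from the substitution names themselves.

```python
import re

# Camelizing substitutions quoted in the text: G44E / L45R / W47G.
# The tetrad positions named in the text are Kabat 37, 44, 45 and 47.
CAMELIZING_SUBSTITUTIONS = ("G44E", "L45R", "W47G")

_SUBSTITUTION = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def apply_substitutions(residues, substitutions):
    """Return a copy of a {Kabat position: residue} map with substitutions applied.

    Each substitution is written <original><position><replacement>, e.g. "G44E".
    A ValueError is raised when the residue found at the position is not the
    expected original, so that a mis-numbered input is caught early.
    """
    updated = dict(residues)
    for text in substitutions:
        match = _SUBSTITUTION.match(text)
        if match is None:
            raise ValueError(f"unrecognised substitution: {text!r}")
        original = match.group(1)
        position = int(match.group(2))
        replacement = match.group(3)
        found = updated.get(position)
        if found != original:
            raise ValueError(
                f"expected {original} at Kabat {position}, found {found!r}"
            )
        updated[position] = replacement
    return updated


# Hypothetical human VH3-like tetrad. Residue 37 (V) is an assumption made
# for illustration only and is not touched by the three substitutions.
human_tetrad = {37: "V", 44: "G", 45: "L", 47: "W"}
camelized = apply_substitutions(human_tetrad, CAMELIZING_SUBSTITUTIONS)
```

The input map is copied rather than modified, so the human and camelized tetrads can be compared side by side.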

A human immunoglobulin variable domain (VH or VL) that exists in nature is a minimum-size antibody (1/12 the size of a monoclonal antibody) capable of maintaining an antigen binding characteristic, and thus is expected to differ from the conventional monoclonal antibody in physical properties and therapeutic effects as a therapeutic protein. Hence, demand for developing human antibodies having only one variable domain has increased. Nevertheless, the tendency of the protein to aggregate and to be unstable when VH or VL exists alone still remains a major obstacle to be overcome in developing a binding scaffold for a specific antigen.

Accordingly, in order for an antibody fragment or a small-size antibody to provide advantages that cannot be achieved by general monoclonal antibodies and to stay competitive, it is important to secure robust pharmaceutical and physicochemical properties of the substance itself.

Some directed molecular evolution methods have also been attempted in the prior art to stabilize human heavy chain or light chain variable domains (Barthelemy et al., J. Biol. Chem. 2008 283(6):3639-54). They constructed a phage display system with a CDR-engineered library of VH, and then screened VHs having binding activity toward protein A after applying thermal stress. It was reported that a CDR-engineered human VH having increased solubility and allowing reversible folding after thermal denaturation was screened by this method (Jespers et al., Nat. Biotechnol. 2004 22(9):1161-5). It was also reported that various libraries in which mutations were induced at the CDR3 portion and the frame portion were prepared without thermal denaturation treatment, and that VHs exhibiting high binding activity toward protein A after phage display were screened by the same method, so that an engineered VH that is thermodynamically stable and has increased soluble expression as compared with a wild-type VH can be obtained (Barthelemy et al., J Biol Chem. 2008 283(6):3639-54).

In the phage display system, the target protein is directed to the Sec pathway by the Sec signal sequence of the pelB protein fused to the N-terminus of the target protein. In this case, however, a protein that has already folded within the cytoplasm of E. coli cannot pass through the pathway because of the inherent limitation of this translocation pathway. The reason is that general phage display uses the Sec pathway, which is a representative protein translocation pathway of E. coli, and, by the nature of this pathway, the target protein passes through the cell membrane in a linear form rather than a three-dimensional structure, with the help of a chaperone within the cytoplasm. Naturally occurring Sec pathway-specific proteins are kept in a linear, unfolded form by a chaperone called SecB within the cytoplasm directly after translation. The Sec pathway target protein, which is moved by SecB to a translocase complex consisting of SecA, SecYEG, and SecDFYajC on the inner membrane, passes through the membrane in a linear form, without a three-dimensional structure, and the translocated amino acid chain forms a complete three-dimensional structure, including a disulfide linkage, through oxidation and reduction by DsbA and DsbB by the time it arrives at the periplasm (Baneyx and Mujacic Nature Biotech. 2004, 22, 1399-1408). Therefore, if folding and three-dimensional structure formation of a certain protein occurs quickly in the cytoplasm owing to the nature of the protein itself, this protein is not compatible with a phage display screening system designed around the Sec pathway.

In addition, it was reported that a wild-type VH having improved physicochemical properties could be selected when clones are screened directly from a plate spread with a bacterial lawn on the basis of plaque size (To et al., J. Biol. Chem. 2005 280(50):41395-403). However, large-scale processing is impossible with plate-based screening, and thus, in order to reduce the size of the library, an initial library of only the VH3 family subjected to a protein A screening procedure in vitro was constructed.

Meanwhile, genetic selection methods have been attempted in order to improve the folding characteristics of recombinant proteins (Maxwell et al., Protein Sci. 1999 8(9):1908-11; Wigley et al., Nat. Biotechnol. 2001 19(2):131-6; Cabantous et al., Nat Biotechnol. 2005 23(1):102-7; Waldo G S. Curr. Opin. Chem. Biol. 2003 7(1):33-8). In one representative method, the degree of folding of a protein of interest is determined indirectly by measuring the activity of a reporter protein fused to the protein of interest by recombinant DNA technology. However, this approach cannot accurately reflect the folding of the protein of interest when it exists alone.

In addition, in order to increase the solubility of a protein, a directed molecular evolution method has been developed in which the Tat (twin-arginine translocation) pathway, a protein translocation pathway that proof-reads the folding quality of proteins, is utilized as a biological filter for determining whether or not the protein is folded. Specifically, the protein of interest is fused to a reporter gene and a Tat signal sequence and directed to the Tat pathway within Escherichia coli, where it is subjected to protein folding proof-reading by the TatABC translocase complex according to the degree of folding and the solubility of the protein. If the target protein has sufficient solubility, a fusion protein consisting of the target protein and the reporter protein passes through the inner membrane of Escherichia coli and reaches the periplasm. The fusion protein reaching the periplasm is detected by a method such as antibiotic resistance measurement, thereby screening proteins having a desired degree of solubility (Fisher et al., Protein Sci. 2006 15(3):449-58). It has been shown that not only natural Tat pathway substrate proteins but also recombinant proteins applied to the Tat pathway pass through it in proportion to their solubility and stability (Lim et al., Protein Sci. 2009 18(12):2537-49). In addition, it has been reported that a single chain antibody (scFv) capable of folding within the cytoplasm of E. coli is effectively screened by using the Tat pathway (Fisher A C and DeLisa M P. J Mol Biol. 2009 385(1): 299-311). According to that document, directed molecular evolution was successfully achieved in vitro by using scFv13, which is expressed in insoluble form in E. coli, as a template sequence.

A disulfide bond present within an scFv contributes about 4 to 6 kcal/mol to the stabilization of the protein molecule. This bond is formed in an oxidizing environment such as the periplasm of bacteria or the endoplasmic reticulum (ER) of eukaryotes. The periplasm of bacteria usually maintains oxidizing conditions through the flow of electrons between DsbA and DsbB present on the inner membrane. Therefore, in the case of scFv13 variants selected from the scFv-engineered library by artificially passing them through the Tat pathway, intrabodies that fold autonomously without forming a disulfide bond in the reducing cytoplasm, rather than in an oxidizing environment, are preferentially selected. Specifically, when a gene in which a signal sequence leading the protein to the Tat pathway is fused to the N-terminus of the target protein and TEM-1 beta-lactamase, functioning as a reporter, is fused to the C-terminus of the target protein is expressed within E. coli, a three-part (tripartite) fusion protein is expressed.

The expressed fusion protein is directed to the Tat pathway and is subjected to a protein folding inspection by the machinery of the TatABC translocase complex on the inner cell membrane. Several studies found that, among many recombinant proteins, only soluble ones are compatible with the machinery of the Tat pathway (Sanders et al., Mol. Microbiol. 2001 41(1):241-6; DeLisa et al., Proc. Natl. Acad. Sci. USA. 2003 100(10):6115-20; Matos et al., EMBO J. 2008 27(15):2055-63; Fisher A C and DeLisa M P. J. Mol. Biol. 2009 385(1): 299-311; Lim et al., Protein Sci. 2009 18(12): 2537-49).
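The order of the three parts of the fusion described above (Tat signal sequence at the N-terminus, target protein in the middle, TEM-1 beta-lactamase reporter at the C-terminus) can be pictured with a small Python sketch. The class name TripartiteFusion, its methods, and the placeholder strings are hypothetical and chosen only for illustration; no real signal, target, or reporter sequence is implied.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TripartiteFusion:
    """Three-part fusion listed from the N-terminus to the C-terminus."""

    tat_signal: str  # N-terminal signal sequence directing export via Tat
    target: str      # protein of interest, e.g. a VH domain
    reporter: str    # C-terminal reporter, e.g. TEM-1 beta-lactamase

    def parts(self):
        """Return (name, sequence) pairs in N-to-C order."""
        return [
            ("tat_signal", self.tat_signal),
            ("target", self.target),
            ("reporter", self.reporter),
        ]

    def sequence(self):
        """Concatenate the three parts into one polypeptide string."""
        return "".join(seq for _, seq in self.parts())

    def boundaries(self):
        """Map each part to its (start, end) span, 0-based and end-exclusive."""
        spans = {}
        start = 0
        for name, seq in self.parts():
            spans[name] = (start, start + len(seq))
            start += len(seq)
        return spans


# Placeholder strings only; these are not real protein sequences.
example = TripartiteFusion(tat_signal="AAAA", target="CCCCCC", reporter="GGG")
full_length = example.sequence()
spans = example.boundaries()
```

Keeping the three parts as separate fields makes it straightforward to swap the target while the signal sequence and the reporter stay fixed, which is the arrangement the screening scheme relies on.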

However, the above method for improving the folding characteristics of a protein has never been applied to selecting domain antibodies, particularly VH or VL domain antibodies or the like.

In conclusion, engineered modification and screening of human VH domain antibodies have so far been conducted, without exception, on the basis of phage display and binding activity to protein A (Kristensen P and Winter G. Fold. Des. 1998 3(5):321-8; Sieber et al., Nat. Biotechnol. 1998 16(10):955-60; Jung et al., J. Mol. Biol. 1999 294(1):163-80; Worn A and Plückthun A. J. Mol. Biol. 2001 305(5): 989-1010).

Therefore, there is an urgent need to develop methods of more efficiently selecting VH domain antibodies having solubility and high thermostability, and further to develop smallest-unit next-generation antibodies having improved efficacy by utilizing the selected domain antibodies.

SUMMARY OF THE INVENTION

An embodiment of the present invention is directed to providing a method named Tat-associated protein engineering (TAPE), capable of efficiently screening a VH or VL domain antibody having solubility and high thermostability.

Further, an embodiment of the present invention is directed to providing human VH and VL domain antibodies and human or engineered VH and VL domain antibody scaffolds that have solubility and excellent thermostability, which are screened by the TAPE method, and providing amino acid sequences of the VH and VL domain antibodies and the antibody scaffolds, and polynucleotides encoding the same.

A VH or VL domain antibody that contains a human or engineered VH or VL scaffold screened according to the present invention retains its solubility and thermostability regardless of its CDR sequences, as long as it has the corresponding human or engineered VH or VL domain antibody scaffold.

Further, an embodiment of the present invention is directed to providing a library including random CDR sequences in the human VH or VL domain antibody scaffold screened by the TAPE method, and a method of preparing the library.

Further, an embodiment of the present invention is directed to providing a VH or VL domain antibody having binding ability to a target protein, screened by using the library, an amino acid sequence of the domain antibody, and a polynucleotide encoding the same.

DETAILED DESCRIPTION OF THE INVENTION

In one general aspect, the present invention provides a method named Tat-associated protein engineering (TAPE), capable of efficiently screening ligands, particularly VH and VL domain antibodies having high solubility and high thermostability, from human immunoglobulin variable domain libraries or combinatorial libraries.

Also, the present invention provides a system, a vector, and a host cell for screening the ligands, and provides the ligands screened by the TAPE method, particularly VH and VL domain antibodies.

Surprisingly, it was found that the VH domain antibody screened by the TAPE method according to the present invention not only has high solubility and thermostability, but also maintains them regardless of the sequences of the CDRs that are inserted, that is, the sequences of CDR1 to CDR3, as long as the VH domain antibody scaffold, that is, the FR1 to FR4 frames, is maintained.

From this point of view, the present invention provides a VH domain antibody library in which randomized human-derived or combinatorial CDR sequences, that is, sequences of CDR1 to CDR3, are inserted into the VH domain antibody scaffold, that is, the FR1 to FR4 frames, screened by the TAPE method of the present invention, and a method of constructing the same.
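The idea of keeping the FR1 to FR4 frames fixed while varying CDR1 to CDR3 can be sketched at the protein level as follows. This is an illustrative toy and not the library construction method of the invention: the frame strings are those listed for scaffold MG2X1 in Table 1, whereas the CDR lengths, the uniform sampling over 20 amino acids, and the helper names (random_cdr, assemble_vh, build_library) are assumptions. A real library would normally be randomized at the DNA level.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Frames of scaffold MG2X1 as listed in Table 1.
SCAFFOLD = {
    "FR1": "EVQLVESGGGLVQPGGSLRLSCAASGFTF",
    "FR2": "WVRQAPGKGLEWVS",
    "FR3": "RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS",
    "FR4": "WGQGTLVTVSS",
}

# Hypothetical CDR lengths, chosen only for illustration.
CDR_LENGTHS = {"CDR1": 5, "CDR2": 17, "CDR3": 10}

# Order of the regions in a VH domain, from N-terminus to C-terminus.
REGION_ORDER = ("FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3", "FR4")


def random_cdr(length, rng):
    """Draw a random CDR of the given length, uniform over 20 amino acids."""
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def assemble_vh(scaffold, cdrs):
    """Interleave fixed frames and CDRs in FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4 order."""
    regions = {**scaffold, **cdrs}
    return "".join(regions[name] for name in REGION_ORDER)


def build_library(scaffold, cdr_lengths, size, seed=0):
    """Return `size` VH sequences that share one scaffold and differ in their CDRs."""
    rng = random.Random(seed)  # seeded so that the sketch is reproducible
    library = []
    for _ in range(size):
        cdrs = {name: random_cdr(n, rng) for name, n in cdr_lengths.items()}
        library.append(assemble_vh(scaffold, cdrs))
    return library


library = build_library(SCAFFOLD, CDR_LENGTHS, size=5)
```

Every member of the resulting list starts with the same FR1 and ends with the same FR4, so any difference between members lies in the CDRs.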

In addition, the present invention provides a method of screening VH domain antibodies having binding ability to target proteins from the constructed library.

The VH domain antibody scaffold, that is, the FR1 to FR4 frames, having high solubility and thermostability provided in the present invention has the amino acid sequences below:

Amino Acid Sequence of FR1:

X₀VQLX₁X₂X₃GX₄X₅X₆X₇X₈PGX₉SX₁₀X₁₁X₁₂X₁₃CX₁₄X₁₅X₁₆GX₁₇X₁₈X₁₉ -Formula 1)

in Formula 1), X₀ is E or Q, X₁ is V or L, X₂ is E or Q, X₃ is S or A, X₄ is G or A, X₅ is G, M, N, V, or E, X₆ is L, V, or W, X₇ is V, K, A, or I, X₈ is Q, K, or H, X₉ is G, T, A, R, E, S, or T, X₁₀ is L, V, R, or M, X₁₁ is R or K, X₁₂ is L, I, or V, X₁₃ is S, A, or T, X₁₄ is A, E, V, R, I, K, T, or S, X₁₅ is A, G, P, V, or T, X₁₆ is S, F, or Y, X₁₇ is F, Y, R, G, or L, X₁₈ is T, A, S, N, T, P, I, N, H, or A, and X₁₉ is F, L, V, or C;

Amino Acid Sequence of FR2:

WX₂₀RX₂₁X₂₂PGX₂₃GX₂₄X₂₅X₂₆X₂₇X₂₈ -Formula 2)

in Formula 2), X₂₀ is V, A, or L, X₂₁ is Q, N, R, I, K, Y, V, M, S, Q, W, F, L, V, or C, X₂₂ is A, G, K, S, V, M, or T, X₂₃ is K, Q, E, R, or T, X₂₄ is L, N, I, P, Y, T, V, W, A, R, M, or S, X₂₅ is V or E, X₂₆ is W, I, V, P, F, H, M, Y, L, C, or R, X₂₇ is V, M, I, or L, and X₂₈ is S, A, or G;

Amino Acid Sequence of FR3:

X₂₉X₃₀X₃₁X₃₂X₃₃X₃₄X₃₅X₃₆X₃₇X₃₈X₃₉X₄₀X₄₁X₄₂X₄₃X₄₄X₄₅X₄₆X₄₇X₄₈X₄₉X₅₀X₅₁DX₅₂X₅₃X₅₄YX₅₅CX₅₆X₅₇ -Formula 3)

in Formula 3), X₂₉ is R, H, Q, or T, X₃₀ is F, V, L, or I, X₃₁ is T, S, or I, X₃₂ is I, L, V, M, or R, X₃₃ is S, T, or D, X₃₄ is R, A, V, N, or I, X₃₅ is D, N, or A, X₃₆ is N, T, D, I, R, K, Y, or E, X₃₇ is A, S, V, or T, X₃₈ is K, R, T, Q, V, E, M, N, or I, X₃₉ is N, R, T, K, S, D, or V, X₄₀ is T, M, S, V, I, Y, or A, X₄₁ is L, V, A, or M, X₄₂ is F, Y, N, D, H, or S, X₄₃ is L or M, X₄₄ is Q, E, H, or N, X₄₅ is M, L, V, I, or W, X₄₆ is N, T, K, D, Y, I, or S, X₄₇ is S or N, X₄₈ is L or V, X₄₉ is R, K, or T, X₅₀ is D, A, S, P, T, V, I, or S, X₅₁ is E, A, D, or S, X₅₂ is T, N, or S, X₅₃ is S, A, or G, X₅₄ is V, I, L, or M, X₅₅ is Y or F, X₅₆ is A, G, V, or S, and X₅₇ is R, S, K, T, L, N, or F; and

Amino Acid Sequence of FR4:

X₅₈GX₅₉GX₆₀X₆₁VTVSS -Formula 4)

in Formula 4), X₅₈ is W, C, Y, G, S, or A, X₅₉ is Q, R, or L, X₆₀ is A, T, I, or V, and X₆₁ is L, M, P, V, or T.
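As an informal aid to reading Formulas 2) and 4), each formula can be transcribed into a regular expression in which a fixed residue matches itself and each variable position X becomes a character class of its allowed residues. The Python sketch below is an illustration only. The token-list representation and the helper names compile_formula and matches are assumptions, and repeated residues in a definition are simply collapsed because a set is used.

```python
import re

# Formula 2), FR2: W X20 R X21 X22 P G X23 G X24 X25 X26 X27 X28
FORMULA_2 = [
    "W", set("VAL"), "R", set("QNRIKYVMSWFLC"), set("AGKSVMT"), "PG",
    set("KQERT"), "G", set("LNIPYTVWARMS"), set("VE"),
    set("WIVPFHMYLCR"), set("VMIL"), set("SAG"),
]

# Formula 4), FR4: X58 G X59 G X60 X61 V T V S S
FORMULA_4 = [
    set("WCYGSA"), "G", set("QRL"), "G", set("ATIV"), set("LMPVT"), "VTVSS",
]


def compile_formula(tokens):
    """Turn a list of fixed strings and allowed-residue sets into a regex."""
    pieces = []
    for token in tokens:
        if isinstance(token, str):
            pieces.append(re.escape(token))  # fixed residues match literally
        else:
            pieces.append("[" + "".join(sorted(token)) + "]")  # variable position
    return re.compile("".join(pieces))


def matches(tokens, sequence):
    """True when the whole sequence fits the formula, position by position."""
    return compile_formula(tokens).fullmatch(sequence) is not None


# FR2 and FR4 strings of scaffold MG2X1 as listed in Table 1.
fr2_fits = matches(FORMULA_2, "WVRQAPGKGLEWVS")
fr4_fits = matches(FORMULA_4, "WGQGTLVTVSS")
```

The use of fullmatch means that a sequence with extra or missing residues at either end is rejected, which mirrors the fixed length of each frame in the formulas.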

Also, the present invention provides a polynucleotide encoding an amino acid sequence of the VH domain antibody scaffold, that is, amino acid sequences of the FR1 to FR4 frames.

More preferably, the VH domain antibody scaffold, that is, the FR1 to FR4 frames, having high solubility and thermostability provided in the present invention has the amino acid sequences below:

Amino Acid Sequence of FR1:

X₀VQLX₁X₂SGGX₅X₆X₇X₈PGX₉SX₁₀RX₁₂SCX₁₄X₁₅SGX₁₇X₁₈X₁₉ -Formula 5)

in Formula 5), X₀ is E or Q, X₁ is V or L, X₂ is E or Q, X₅ is G, N, V, or E, X₆ is L or V, X₇ is V or K, X₈ is Q, K, or H, X₉ is G, T, A, R, E, or T, X₁₀ is L or V, X₁₂ is L or V, X₁₄ is A, E, V, I, K, or S, X₁₅ is A, G, or V, X₁₇ is F, Y, R, G, or L, X₁₈ is T, A, S, N, T, P, I, N, H, or A, and X₁₉ is F, L, V, or C;

Amino Acid Sequence of FR2:

WVRX₂₁X₂₂PGX₂₃GX₂₄X₂₅X₂₆X₂₇X₂₈ -Formula 6)

in Formula 6), X₂₁ is Q, N, R, I, K, Y, V, M, S, Q, W, F, L, V, or C, X₂₂ is A, G, K, S, or M, X₂₃ is K, Q, E, R, or T, X₂₄ is L, N, I, P, Y, T, V, W, A, R, M, or S, X₂₅ is V or E, X₂₆ is W, I, V, P, F, H, M, Y, L, C, or R, X₂₇ is V, M, I, or L, and X₂₈ is S, A, or G;

Amino Acid Sequence of FR3:

RX₃₀TX₃₂SX₃₄DX₃₆X₃₇X₃₈X₃₉X₄₀X₄₁X₄₂X₄₃X₄₄X₄₅X₄₆X₄₇X₄₈X₄₉X₅₀X₅₁DTAX₅₄YX₅₅CX₅₆X₅₇ -Formula 7)

in Formula 7), X₃₀ is F, V, L, or I, X₃₂ is I, L, V, or M, X₃₄ is R, A, V, or I, X₃₆ is N, T, D, I, R, K, Y, or E, X₃₇ is A, S, V, or T, X₃₈ is K, R, T, Q, V, E, M, N, or I, X₃₉ is N, R, T, K, S, D, or V, X₄₀ is T, M, S, V, I, Y, or A, X₄₁ is L, V, A, or M, X₄₂ is F, Y, N, D, H, or S, X₄₃ is L or M, X₄₄ is Q, E, H, or N, X₄₅ is M, L, V, I, or W, X₄₆ is N, T, K, D, Y, I, or S, X₄₇ is S or N, X₄₈ is L or V, X₄₉ is R, K, or T, X₅₀ is D, A, S, P, T, V, I, or S, X₅₁ is E, A, D, or S, X₅₄ is V, I, L, or M, X₅₅ is Y or F, X₅₆ is A, G, V, or S, and X₅₇ is R, S, K, T, L, N, or F; and

Amino Acid Sequence of FR4:

X₅₈GQGX₆₀X₆₁VTVSS -Formula 8)

in Formula 8), X₅₈ is W, C, Y, G, S, or A, X₆₀ is A, T, I, or V, and X₆₁ is L, M, V, or T.

Also, the present invention provides a polynucleotide encoding an amino acid sequence of the heavy chain variable domain (VH) antibody scaffold, that is, amino acid sequences of the FR1 to FR4 frames.

More preferably, the VH domain antibody scaffold, that is, the FR1 to FR4 frames, having high solubility and thermostability provided in the present invention has the amino acid sequences described in Table 1.

TABLE 1 Amino acid sequences of FR1 to FR4 frames of VH antibody scaffold screened by TAPE (derived from human germline)

Scaffold name  FR1                            FR2             FR3                               FR4
MG1X8          QVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLVWVS  RFTISRDNAKNTLFLQMNSLRDEDTSVYYCAR  WGQGALVTVSS
MG2X1          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2X1-34       QVQLVESGGNVVQPGTSLRLSCAASGFTF  WVRQAPGKGLEWLA  RFTISRDNSRNTVFLQMTSLRAEDTAVYYCGR  WGQGILVTVSS
MG2X2-12       QVQLVQSGAEVKKPGASVKISCEASGYAF  WVRQAPGQGLEWMG  RVTLTRDTSTRTVYMELKNLRSADTGVYYCAR  WGQGTLVTVSS
MG2X2-13       EVQLLESGGGVVQPGKSLRLSCVGSGFSF  WVRQAPGKGLEWLA  RFTISRDNSKTMVNLQMNSLRPDDTAVYFCAR  WGQGTLVTVSS
MG3X1          QVQLVESGGGVVQPGRSLRLSCVASGFNF  WLRQAPGKGLEWLA  RFTISRDNSKNTLYLEMNSLRPEDTAVYYCAK  CGQGTLVTVSS
MG3X10         EVQLVESGGGLVKPGGSLRVSCAASGFTF  WVRQAPGKGLEWMG  RFTISRDDSKNMVYLQMNSLKTEDTAVYYCTT  YGQGTLVTVSS
MG4X1-8        EVQLVESGGGLVQPGGSLRLSCAASGFSF  WVRQGPGEGLVWLS  RFTISRDNAKNTVYLEMNSVRVDDTAVYYCVS  WGQGALVTVSS
MG4X1-33       QVQLVESGGGLVQPGGSLRLSCEASGFPF  WVRQAPGKGLEWIS  RFTISRDDSTNTLYLQVNSLRAEDTAVYYCAK  WGRGTLVTVSS
MG4X1-35       EVQLLESGGGLVKPGGSLRLSCVGSERSF  WVRQAPGKGLEWVA  RFTVSRDNVQKSLDLQMDSLRAEDTAVYFCAR  WGQGTTVTVSS
MG4X3-27       EVQLLESGGGLAQSGGSLRLSCAASGFTF  WVRQAPGKGLEWIS  RFTISRDIAKNSLYLQMNSLRDEDTAVYYCAK  WGQGALVTVSS
MG4X4-2        EVQLVQSGAEVKKPGESLRISCRGSGYRF  WARDKPGKGLEWIG  HVTISSDRSVSVAYLQWDSLKASDNGIYYCAL  WGQGTLVTVSS
MG4X4-4        EVQLVESGGGLVQPGGSLRLSCVPSGFTF  WVRQAPGKGLVWLS  RFTISRDNAEDTLFLQMNSLRVDDTAVYYCVR  WGQGVLVTVSS
MG4X4-25       QVQLVESGGGLVQPGGSLRLSCIASGFSL  WVRRSPGKGLEWVA  RFTVSRDNAKNSLFLQMNNVRPEDTALYFCAR  WGQGTMVTVSS
MG4X4-44       EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNAKNSLYLQMNSLRAEDTALYYCAR  WGQGTLVTVSS
MG4X5-30       EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWLS  RFTISRNNAKNSLYLQMNSLRVDDTAVYYCAR  WGQGTLVTVSS
MG4X6-27       EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQGPGKGLEWVA  RFTISRDNAENSLYLQVNSLRAEDTAIYYCAK  WGQGALVTVSS
MG4X6-48       EVQLLESGGGVVQPGRSLRLSCEVFGFTL  WVRQAPGRRLEWVA  RFTISRDIATNRLYLQMRSLRAEDTALYYCAR  WGQGTLVTVSS
MG4X7-15       EVQLLESGGGLVQPGGSLRLSCAASGFSF  WVRQAPGKGLEWIS  RFTISRDNSKNTLYLQMNSLRVEDTAVYYCAV  WGQGTTVTVSS
MG4X8-24       EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWIS  RFTISRDNSNNTLYLQMNSLRADDTAVYFCAK  WGQGTLVTVSS
MG0.5X-1       QVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQVPGKGLEWVA  RFTISRDNAKNSLYLQMNSLRAEDTAVYYCAN  WGQGTLVTVSS
MG0.5X-3       QVQLVESGGGLVQPGGSLTLSCAASGFTF  WVRQAPGTGLLWLS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAR  WGXGTMVTVSX
MG0.5X-4       EVQLLESGGMLVKPGESLRLSCVGSGLIF  WVRHAPGKGLEWMG  RLSISRDDSMNTVYLDIYNLKIDDTGVYYCTF  WGQGTPVTVSS
MG0.5X-14      EVQLLESGGGLVHAGGSVRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNSKNSMYLQMNSLRVEDTAVYYCAR  WGQGTVVTVSS
MG0.75X-4      QVQLVESGGGLVKPGGSLRLSCAASGFTF  WLRQAPGKGPEYVA  RFIISRDDSNDMLYLEMISLKSEDTAVYYCSD  GSQGTLVTVSS
MG2X-5         EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSKNTLYLHMNSLRAEDTAVYYCVK  WGQGTLVTVSS
MG2X-15        QVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAK  WGQGTLVTVSS
MG4X-5         QVQLVESGGGLVQPGGSLRLSCEASGLHF  WVRQAPGKGLEWVA  RFTVSRDNSRNTLYLQMKSLSAEDTAVYYCAK  WGQGTMVTVSS
MG1-4          QVQLVEAGGGLVQPGGSLRLACAASGFTF  WVRQAPGKGLEWIS  RFTISRDNSQNSLFLQMNSLRAEDTAVYYCAT  WGQGTMVTVSS
MG1-6          EVQLVQSGAEVKKPGESLRKSCKGSGYSF  WVRQMPGKGLEWMG  HVTISVDKSISTAYLQWSSLKASDSAMYYFL   WGQGTLVTVSS
MG1-7          QVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNAKNSLYLQMNSLRDEDTAVYYCAR  WGQGTLVTVSS
MG1-8          EVQLVQSGAEVKKPGASVKVSCKASGYTF  WVRQAPGQGLEWMG  RVTMTRDTSSTTAYMELNRLTSDDTAVYFCAR  WGQGTLVTVSS
MG1-9          EVQLVEAGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWIS  RFTISRDNAQNSLFLQMNSLRAEDTAVYYCAT  WGQGTMVTVSS
MG1-10         EVQLVQSGAEVKKPGESLKISCKGSGYSF  WVRQMPGRGLEWLG  QVTMSANRSISTAYLQWSSLKASDTGIYYCAT  WGQGTTVTVSS
MG5-1          QVQLVESGGGLIQPGESLRLSCEAFGFTV  WVRQAPGKGLEWVS  RFTISRDSTQNTVHLQMNSLTAEDTAVYYCAR  WGQGTLVTVSS
MG5-2          EVQLVQSGAELKKPGSSVKVSCTSSGGSF  WVRQAPGQGLEWMG  RLILSVDEPTRTVYMELTSLRSDDTAMYYCAR  WGQGTTVTVSS
MG5-4          EVQLLESGGGLVQPGRSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNAKDSLYLQMNSLRPEDTALYYCAR  WGQGTMVTVSS
MG5-5          EVQLLESGGGVVQPGRSLRLSCVASGFTF  WVRQAPGKGLEWVS  RFTISRDYSNKIVHLEMDSLRAEDTAVYFCVR  WGQGTLVTVSS
MG5-6          EVQLLESGGGLVKPGGSLRLSCAASGFTF  WVRQAPGKGLECVA  RFTISRDDSRDMLYLQMNNLKTEDTAVYYCSD  SSQGTLVTVSS
MG5-7          EVQLVESGGGLVQPGRSLRLSCTTSGFSF  WVRQAPGKGLEWVS  RFTISRDDSKSIVYLQMSSLQTEDTAVYYCSR  WGRGTLVTVSS
MG5-9          EVQLLESGGGLVRPGGSLRLSCSASGFAF  WVRQAPGKGLEWVS  TISRDNAKNSVYLQMNSLRAEDSAVYFCAR    WGQGTLVTVSS
MG10-1         QVQLVESGGNVVQPGTSLRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNSRNTVFLQMTSLRAEDTAVYYCGR  WGQGILVTVSS
MG10-2         EVQLLESGGGLVQPGGSLRLTCVGYGFTF  WVRQAPGKGPEWVA  RFTISRDNAKDSLYLQMDSLRPEDTAVYYCAR  APQGTLVTVSS
MG10-4         EVQLLESGGGLVQPGGSLRLSCAASGFIL  WVRQAPGKGLVWVS  QFTISRDNAKNTLYLQMNSLRVEDTAVYYCAR  WGQGTMVTVSS
MG10-5         EVQLLESGGGVVHPGRSLRLSCAVSGFSL  WVRQAPDKGLEWLA  RFTVSRDISKNTVYLQMNSLRAEDTALYYCAR  WGQGTMVTVSS
MG10-6         EVQLLESGGGLVQPGGSRRLSCAASGFTF  WFRQGPGKGLEWLA  RFTISRDDSKNSLSLQMDSLRTEDTAVYYCVR  WGQGTVVTVSS
MG10-8         QVQLVESGGGVVQPGRSLRLSCVASGFAF  WVRQTPGRGLEWLA  RFTISRDNSNNTVYLEMNSLRPEDSAIYYCAK  WGLGTVVTVSS
MG10-10        QVQLVESGGVVVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSKNSLYLQMNSLRTDETALYYCV   WGQGTLVTVSS
MG2            EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNAKNSLYLQMNSLRTDETAVYYCAR  WGQGTTVTVSS
M5G            EVQLLQSGGGWVKPGGSLRLSCAASGFIC  WVRQAPGKGLEWMG  RFTISIDESRNALFLHMNSLTTDDTAVYYCST  WGQGTLVTVSS
MG6            EVQLLESGGVVVQPGRSLRLSCAASGFTF  WVRQAPGKGLEWLA  RFTVSRDTSTNTLYLQMNSLRVEDTAVYYCAR  WGQGTLVTVSS
MG7            QMQLVQSEAEVKKPGASMKVSCKASGYTF  WVRQATGQGLEWMG  RVTMTRNTSISTAYMELSSLTSADTAVYYCAR  WGQGTLVTVSS
MG10           QVQLVQSGAEVKKPGESLKISCKGSGYSF  WVRQMPGKGLEWMG  QVTISADKSISTAFLQWNSLKASDTAMYYCAR  WGLGTLVTVSS

Also, the present invention provides a polynucleotide encoding an amino acid sequence of the VH domain antibody scaffold, that is, amino acid sequences of FR1 to FR4 frames.

In particular, VH domain antibody scaffolds, that is, FR1 to FR4 frames, that are improved through modification of a part of the amino acid sequence of the frame have the amino acid sequences described in Table 2.

TABLE 2
Amino acid sequences of FR1 to FR4 frames of amino acid-modified VH domain antibody scaffold

Scaffold name  FR1                            FR2             FR3                               FR4
MG8-21         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGNEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-12L        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRRAPGKGIEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-7I         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGPEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-9I         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRKAPGKGYEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-10I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGYEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-11I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGYEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-12I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGIEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-32         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGPEHVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-34         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRSAPGKGVEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-40         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGTEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-46         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRCAPGKGYEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-47         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGLEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-48         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGLEYVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-51         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGTEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-53         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGVEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-55         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRWAPGKGPEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-57         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGREWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-58         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGCELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-59         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRKAPGKGLETVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-60         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGLECVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-64         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRCAPGKGWEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-12         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGVELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-13         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGAEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-17         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGREWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-18         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGVEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-20         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGLEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-28         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGTERVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-2          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGMEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-32         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRAAPGKGPELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-33         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGYEHVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-34         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGLECVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-5          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGPETVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-6          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGSEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-7          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGTEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-11         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGAEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-12         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRWAPGKGKEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-13         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGIEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-14         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGPEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-4          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGPEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-5          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGIEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-6          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGVEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-8          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRAAPGKGLEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS

The VH domain including the amino acid sequence of the VH antibody scaffold, that is, the amino acid sequences of FR1 to FR4 frames, according to the present invention has an amino acid sequence represented by Formula 9):

FR1-X-FR2-X-FR3-X-FR4   Formula 9)

In Formula 9), X means CDR1, CDR2, and CDR3, in order from the left side.
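For illustration only, the layout of Formula 9) can be sketched in Python as a simple concatenation of four frames and three CDRs. The function name `assemble_vh` and the placeholder CDR strings are assumptions introduced for this sketch and are not part of the disclosure; the frame strings are those of the MG2X1 entry in the tables of this section.

```python
def assemble_vh(frames, cdrs):
    """Join FR1-CDR1-FR2-CDR2-FR3-CDR3-FR4 into one amino acid string."""
    if len(frames) != 4 or len(cdrs) != 3:
        raise ValueError("expected four frames and three CDRs")
    fr1, fr2, fr3, fr4 = frames
    cdr1, cdr2, cdr3 = cdrs
    return fr1 + cdr1 + fr2 + cdr2 + fr3 + cdr3 + fr4


# FR1 to FR4 of the MG2X1 scaffold, as listed in this section.
MG2X1_FRAMES = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFTF",
    "WVRQAPGKGLEWVS",
    "RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS",
    "WGQGTLVTVSS",
)

# Hypothetical placeholder CDRs; real CDRs would come from a library.
PLACEHOLDER_CDRS = ("AAAAA", "GGGGGGG", "SSSSSSSSS")

vh = assemble_vh(MG2X1_FRAMES, PLACEHOLDER_CDRS)
print(vh)
```

Each X of Formula 9) is filled in left-to-right order, so the first placeholder lands between FR1 and FR2, the second between FR2 and FR3, and the third between FR3 and FR4.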

Specifically, the VH domain including the amino acid sequences of FR1 to FR4 frames according to the present invention has one selected from amino acid sequences of SEQ ID NOs: 37 to 89, and SEQ ID NOs: 90 to 131 as shown in Tables 3 and 4.

TABLE 3
Amino acid sequences of VH domain including FR1 to FR4 frames of the VH domain antibody scaffold screened by TAPE (derived from human germline)

Scaffold name  SEQ ID NO  Amino acid sequence of VH region (X means CDR1, CDR2, and CDR3, in order from the left side)
MG1X8           49        QVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLVWVS-X-RFTISRDNAKNTLFLQMNSLRDEDTSVYYCAR-X-WGQGALVTVSS
MG2X1           50        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2X1-34        51        QVQLVESGGNVVQPGTSLRLSCAASGFTF-X-WVRQAPGKGLEWVA-X-RFTISRDNSRNIVFLQMTSLRAEDTAVYYCGR-X-WGQGILVTVSS
MG2X2-12        52        QVQLVQSGAEVKKPGASVKISCEASGYAF-X-WVRQAPGQGLEWMG-X-RVTLTRDTSTRTVYMELKNLRSADTGVYYCAR-X-WGQGTLVTVSS
MG2X2-13        53        EVQLLESGGGVVQPGKSLRLSCVGSGFSF-X-WVRQAPGKGLEWLA-X-RFTISRDNSKTMVNLQMNSLRPDDTAVYFCAR-X-WGQGTLVTVSS
MG3X1           54        QVQLVESGGGVVQPGRSLRLSCVASGFNF-X-WLRQAPGKGLEWVA-X-RFTISRDNSKNTLYLEMNSLRPEDTAVYYCAK-X-CGQGTLVTVSS
MG3X10          55        EVQLVESGGGLVKPGGSLRVSCAASGFTF-X-WVRQAPGKGLEWVG-X-RFTISRDDSKNMVYLQMNSLKTEDTAVYYCTT-X-YGQGTLVTVSS
MG4X1-8         56        EVQLVESGGGLVQPGGSLRLSCAASGFSF-X-WVRQGPGEGLVWLS-X-RFTISRDNAKNTVYLEMNSVRVDDTAVYYCVS-X-WGQGALVTVSS
MG4X1-33        57        QVQLVESGGGLVQPGGSLRLSCEASGFPF-X-WVRQAPGKGLEWVS-X-RFTISRDDSTNTLYLQVNSLRAEDTAVYYCAK-X-WGRGTLVTVSS
MG4X1-35        58        EVQLLESGGGLVKPGGSLRLSCVGSERSF-X-WVRQAPGKGLEWVA-X-RFTVSRDNVQKSLDLQMDSLRAEDTAVYFCAR-X-WGQGTTVTVSS
MG4X3-27        59        EVQLLESGGGLAQSGGSLRLSCAASGFTF-X-WVRQAPGKGLEWIS-X-RFTISRDIAKNSLYLQMNSLRDEDTAVYYCAK-X-WGQGALVTVSS
MG4X4-2         60        EVQLVQSGAEVKKPGESLRISCRGSGYRF-X-WARDKPGKGLEWIG-X-HVTISSDRSVSVAYLQWDSLKASDNGIYYCAL-X-WGQGTLVTVSS
MG4X4-4         61        EVQLVESGGGLVQPGGSLRLSCVPSGFTF-X-WVRQAPGKGLVWVS-X-RFTISRDNAEDTLFLQMNSLRVDDTAVYYCVR-X-WGQGVLVTVSS
MG4X4-25        62        QVQLVESGGGLVQPGGSLRLSCIASGFSL-X-WVRRSPGKGLEWVA-X-RFTVSRDNAKNSLFLQMNNVRPEDTALYFCAR-X-WGQGTMVTVSS
MG4X4-44        63        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWVA-X-RFTISRDNAKNSLYLQMNSLRAEDTALYYCAR-X-WGQGTLVTVSS
MG4X5-30        64        EVQLLESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWLS-X-RFTISRNNAKNSLYLQMNSLRVDDTAVYYCAR-X-WGQGTLVTVSS
MG4X6-27        65        EVQLLESGGGLVQPGGSLRLSCAASGFTF-X-WVRQGPGKGLEWVA-X-RFTISRDNAENSLYLQVNSLRAEDTAIYYCAK-X-WGQGALVTVSS
MG4X6-48        66        EVQLLESGGGVVQPGRSLRLSCEVFGFTL-X-WVRQAPGRRLEWVA-X-RFTISRDIATNRLYLQMRSLRAEDTALYYCAR-X-WGQGTLVTVSS
MG4X7-15        67        EVQLLESGGGLVQPGGSLRLSCAASGFSF-X-WVRQAPGKGLEWVS-X-RFTISRDNSKNTLYLQMNSLRVEDTAVYYCAV-X-WGQGTTVTVSS
MG4X8-24        68        EVQLLESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWVS-X-RFTISRDNSNNTLYLQMNSLRADDTAVYFCAK-X-WGQGTLVTVSS
MG0.5X-1        69        QVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQVPGKGLEWVA-X-RFTISRDNAKNSLYLQMNSLRAEDTAVYYCAN-X-WGQGTLVTVSS
MG0.5X-3        70        QVQLVESGGGLVQPGGSLTLSCAASGFTF-X-WVRQAPGTGLLWLS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAR-X-WGXGTMVTVSX
MG0.5X-4        71        EVQLLESGGMLVKPGESLRLSCVGSGLIF-X-WVRHAPGKGLEWVG-X-RLSISRDDSMNTVYLDIYNLKIDDTGVYYCTF-X-WGQGTPVTVSS
MG0.5X-14       72        EVQLLESGGGLVHAGGSVRLSCAASGFTF-X-WVRQAPGKGLEWVA-X-RFTISRDNSKNSMYLQMNSLRVEDTAVYYCAR-X-WGQGTVVTVSS
MG0.75X-4       73        QVQLVESGGGLVKPGGSLRLSCAASGFTF-X-WLRQAPGKGPEYVA-X-RFIISRDDSNDMLYLEMISLKSEDTAVYYCSD-X-GSQGTLVTVSS
MG2X-5          74        EVQLLESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWVS-X-RFTISRDNSKNTLYLHMNSLRAEDTAVYYCVK-X-WGQGTLVTVSS
MG2X-15         75        QVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAK-X-WGQGTLVTVSS
MG4X-5          76        QVQLVESGGGLVQPGGSLRLSCEASGLHF-X-WVRQAPGKGLEWVA-X-RFTVSRDNSRNTLYLQMKSLSAEDTAVYYCAK-X-WGQGTMVTVSS
MG1-4           77        QVQLVEAGGGLVQPGGSLRLACAASGFTF-X-WVRQAPGKGLEWIS-X-RFTISRDNSQNSLFLQMNSLRAEDTAVYYCAT-X-WGQGTMVTVSS
MG1-6           78        EVQLVQSGAEVKKPGESLRKSCKGSGYSF-X-WVRQMPGKGLEWMG-X-HVTISVDKSISTAYLQWSSLKASDSAMYYFL-X-WGQGTLVTVSS
MG1-7           79        QVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWVA-X-RFTISRDNAKNSLYLQMNSLRDEDTAVYYCAR-X-WGQGTLVTVSS
MG1-8           80        EVQLVQSGAEVKKPGASVKVSCKASGYTF-X-WVRQAPGQGLEWMG-X-RVTMTRDTSSTTAYMELNRLTSDDTAVYFCAR-X-WGQGTLVTVSS
MG1-9           81        EVQLVEAGGGLVQPGGSLRLACAASGFTF-X-WVRQAPGKGLEWIS-X-RFTISRDNAQNSLFLQMNSLRAEDTAVYYCAT-X-WGQGTMVTVSS
MG1-10          82        EVQLVQSGAEVKKPGESLKISCKGSGYSF-X-WVRQMPGRGLEWLG-X-QVTMSANRSISTAYLQWSSLKASDTGIYYCAT-X-WGQGTTVTVSS
MG5-1           83        QVQLVESGGGLIQPGESLRLSCEAFGFTV-X-WVRQAPGKGLEWVS-X-RFTISRDSTQNTVHLQMNSLTAEDTAVYYCAR-X-WGQGTLVTVSS
MG5-2           84        EVQLVQSGAELKKPGSSVKVSCTSSGGSF-X-WVRQAPGQGLEWMG-X-RLILSVDEPTRTVYMELTSLRSDDTAMYYCAR-X-WGQGTTVTVSS
MG5-4           85        EVQLLESGGGLVQPGRSLRLSCAASGFTF-X-WVRQAPGKGLEWVS-X-RFTISRDNAKDSLYLQMNSLRPEDTALYYCAR-X-WGQGTMVTVSS
MG5-5           86        EVQLLESGGGVVQPGRSLRLSCVASGFTF-X-WVRQAPGKGLEWVS-X-RFTISRDYSNKIVHLEMDSLRAEDTAVYFCVR-X-WGQGTLVTVSS
MG5-6           87        EVQLLESGGGLVKPGGSLRLSCAASGFTF-X-WVRQAPGKGLECVA-X-RFTISRDDSRDMLYLQMNNLKTEDTAVYYCSD-X-SSQGTLVTVSS
MG5-7           88        EVQLVESGGGLVQPGRSLRLSCTTSGFSF-X-WVRQAPGKGLEWVS-X-RFTISRDDSKSIVYLQMSSLQTEDTAVYYCSR-X-WGRGTLVTVSS
MG5-9           89        EVQLLESGGGLVRPGGSLRLSCSASGFAF-X-WVRQAPGKGLEWVS-X-TISRDNAKNSVYLQMNSLRAEDSAVYFCAR-X-WGQGTLVTVSS
MG10-1          90        QVQLVESGGNVVQPGTSLRLSCAASGFTF-X-WVRQAPGKGLEWVA-X-RFTISRDNSRNTVFLQMTSLRAEDTAVYYCGR-X-WGQGILVTVSS
MG10-2          91        EVQLLESGGGLVQPGGSLRLTCVGYGFTF-X-WVRQAPGKGPEWVA-X-RFTISRDNAKDSLYLQMDSLRPEDTAVYYCAR-X-APQGTLVTVSS
MG10-4          92        EVQLLESGGGLVQPGGSLRLSCAASGFIL-X-WVRQAPGKGLVWVS-X-QFTISRDNAKNTLYLQMNSLRVEDTAVYYCAR-X-WGQGTMVTVSS
MG10-5          93        EVQLLESGGGVVHPGRSLRLSCAVSGFSL-X-WVRQAPDKGLEWVA-X-RFTVSRDISKNTVYLQMNSLRAEDTALYYCAR-X-WGQGTMVTVSS
MG10-6          94        EVQLLESGGGLVQPGGSRRLSCAASGFTF-X-WFRQGPGKGLEWVA-X-RFTISRDDSKNSLSLQMDSLRTEDTAVYYCVR-X-WGQGTVVTVSS
MG10-8          95        QVQLVESGGGVVQPGRSLRLSCVASGFAF-X-WVRQTPGRGLEWLA-X-RFTISRDNSNNTVYLEMNSLRPEDSAIYYCAK-X-WGLGTVVTVSS
MG10-10         96        QVQLVESGGVVVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWVS-X-RFTISRDNSKNSLYLQMNSLRTDETALYYCV-X-WGQGTLVTVSS
MG2             97        EVQLLESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGLEWVS-X-RFTISRDNAKNSLYLQMNSLRTDETAVYYCAR-X-WGQGTTVTVSS
MG5             98        EVQLLQSGGGWVKPGGSLRLSCAASGFIC-X-WVRQAPGKGLEWVG-X-RFTISIDESRNALFLHMNSLTTDDTAVYYCST-X-WGQGTLVTVSS
MG6             99        EVQLLESGGVVVQPGRSLRLSCAASGFTF-X-WVRQAPGKGLEWVA-X-RFTVSRDTSTNTLYLQMNSLRVEDTAVYYCAR-X-WGQGTLVTVSS
MG7            100        QMQLVQSEAEVKKPGASMKVSCKASGYTF-X-WVRQATGQGLEWMG-X-RVTMTRNTSISTAYMELSSLTSADTAVYYCAR-X-WGQGTLVTVSS
MG10           101        QVQLVQSGAEVKKPGESLKISCKGSGYSF-X-WVRQMPGKGLEWMG-X-QVTISADKSISTAFLQWNSLKASDTAMYYCAR-X-WGLGTLVTVSS

TABLE 4
Amino acid sequences of VH domain including FR1 to FR4 frames of amino acid-modified VH domain antibody scaffold

Scaffold name  SEQ ID NO  Amino acid sequence of VH region (X means CDR1, CDR2, and CDR3, in order from the left side)
MG8-21         102        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRNAPGKGNEIVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-12L        103        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRRAPGKGIEVVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-7I         104        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRIAPGKGPEPVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-9I         105        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRKAPGKGYEPVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-10I        106        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRNAPGKGYEIVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-11I        107        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRYAPGKGYEFVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-12I        108        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRVAPGKGIEPVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-32         109        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRMAPGKGPEHVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-34         110        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRSAPGKGVEMVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-40         111        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRTAPGKGTEMVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-46         112        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRCAPGKGYEFVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-47         113        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRIAPGKGLEMVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-48         114        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRMAPGKGLEYVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-51         115        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRYAPGKGTEFVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-53         116        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGVEWVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-55         117        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRWAPGKGPEFVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-57         118        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRFAPGKGREWVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-58         119        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRFAPGKGCELVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-59         120        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRKAPGKGLETVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-60         121        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRNAPGKGLECVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG2-64         122        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRCAPGKGWEVVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-12         123        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRLAPGKGVELVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-13         124        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRFAPGKGAEWVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-17         125        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRLAPGKGREWVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-18         126        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRYAPGKGVEFVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-20         127        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRFAPGKGLEMVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-28         128        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRVAPGKGTERVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-2          129        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRIAPGKGMEMVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-32         130        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRAAPGKGPELVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-33         131        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRVAPGKGYEHVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-34         132        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRVAPGKGLECVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-5          133        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRVAPGKGPETVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-6          134        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRMAPGKGSEVVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG4-7          135        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRLAPGKGTEMVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG8-11         136        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRTAPGKGAEWVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG8-12         137        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRWAPGKGKEVVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG8-13         138        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGIEPVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG8-14         139        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGPEWVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG8-4          140        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRQAPGKGPEVVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG8-5          141        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRTAPGKGIEIVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG8-6          142        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRIAPGKGVEIVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS
MG8-8          143        EVQLVESGGGLVQPGGSLRLSCAASGFTF-X-WVRAAPGKGLEVVS-X-RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS-X-WGQGTLVTVSS

Also, the present invention provides a polynucleotide encoding an amino acid sequence of the VH domain antibody scaffold, that is, amino acid sequences of FR1 to FR4 frames.

In order that the present invention may be more readily understood, certain terms and abbreviations used herein are first defined before the present invention is described in detail.

Human immunoglobulin variable domain: This means the heavy chain variable domains (VH) or light chain variable domains (VL), which directly participate in binding to antigen, among the 12 domains (one pair each of VH, CH1, CH2, CH3, VL, and CL) constituting the structure of human immunoglobulin G. The VH or VL has a structure where nine beta-sheet strands cross each other. In the case of VH, variable regions are present between the second and third beta-sheets, between the fourth and fifth beta-sheets, and between the eighth and ninth beta-sheets, starting from the N-terminal thereof, and these variable regions are referred to as CDR1, CDR2, and CDR3 (CDR: complementarity determining region), respectively. Starting from the N-terminal of VH, the entire section of the first and second beta-sheets excluding CDR1 is referred to as Frame 1 (FR1), the section between CDR1 and CDR2 including the third and fourth beta-sheets is referred to as Frame 2 (FR2), the section between CDR2 and CDR3 is referred to as Frame 3 (FR3), and the section after CDR3 is referred to as Frame 4 (FR4). The respective frames do not directly participate in binding to antigens even though they are part of the variable domains, and most amino acid sequences in these regions are consistently conserved in human immunoglobulins. The VH segments are classified into seven families (VH1, VH2, VH3, VH4, VH5, VH6, and VH7) according to homology of the amino acid sequences in the frame regions. The VL segments are divided into Vkappa and Vlambda; the Vkappa segments are classified into six families and the Vlambda segments are classified into ten families (Chothia et al., 1992 J Mol Biol 227, 799-917; Tomlinson et al., 1995 EMBO J 14, 4628-4638; Williams et al., J Mol Biol 264, 220-232).

CDR (complementarity determining region): This may be called a hypervariable region, and it means a region which is positioned within the heavy chain or light chain variable domain in the structure of the antibody and participates directly in binding to an epitope of an antigen. Each variable domain has three CDR regions, and they consist of amino acid sequences of various lengths.

Antibody scaffold: This means the rest of the antibody structure excluding the hypervariable regions, that is, the CDR regions, which directly bind to antigen, and an amino acid sequence thereof is conserved when a CDR-engineered library is prepared. In other words, this means the rest of the amino acid sequence of the antibody, excluding CDR1, CDR2, and CDR3, which are hypervariable regions. In the present invention, this means a region of the entire FR1 to FR4 frames, which corresponds to the rest of the region excluding CDR1, CDR2, and CDR3 in each of the human immunoglobulin variable domains.

That is, the antibody scaffold of the VH domain in the present invention means the entire region of FR1 to FR4 frames, excluding the CDR1 to CDR3 regions in the VH domain antibody.

This antibody scaffold may be utilized as a scaffold for preparing a CDR-engineered library for screening a protein binding to a target, particularly a VH domain antibody.

Domain antibody: This means, in a broad sense, a modified protein capable of binding a specific antigen, including some of the domains constituting the structure of human immunoglobulin. This means, in a narrow sense, a modified form of a human immunoglobulin variable domain (VH or VL) that can be suitably used as a therapeutic antibody.

In particular, a single domain antibody generally means a variable domain derived from a single heavy chain antibody consisting of only a heavy chain. A single domain antibody derived from dromedary is called a VHH (heavy chain variable domain from single heavy chain antibody), and a single domain antibody derived from Chondrichthyes such as sharks is called a VNAR (variable new antigen receptor).

The single domain antibody in the present invention, for example, a VH domain antibody or a VL domain antibody, means an antibody consisting of only one human-derived heavy chain or light chain variable domain, without particular limitations.

Target protein: This means a protein of interest which is to be selected from the library in the TAPE method in order to improve protein characteristics. In the present invention, this means a protein which is encoded by a gene functionally linked between a Tat signal sequence and a TEM-1 beta-lactamase gene of a pET-TAPE expression vector. Any protein for which solubility is required may be used without limitations, and examples thereof include human-derived antibodies and fragments thereof, receptors, receptor ligands, and the like, and particularly include natural types of human immunoglobulin variable domains derived from human germline cells and artificially mutated proteins thereof.

Ligand: This means a protein having binding ability to a specific receptor or a targeting protein among target proteins screened from the library.

Multi-specificity: This means the property of an antibody capable of specifically binding to one or more epitopes. This may also mean the property of an antibody that recognizes two or more epitopes in one target object, or the property of an antibody that binds to two or more target objects.

Fusion protein: This means a protein where at least two proteins or peptides having different functions, which are encoded by nucleic acids, are functionally linked to each other, for example, a Tat signal sequence and a target protein are functionally linked to each other. Here, a reporter gene such as a TEM-1 beta-lactamase, or a tag such as 6×His or Flag, may be added thereto. Expression of the genes of proteins that are functionally linked to each other is regulated on an expression vector by one promoter (induction type, maintenance type, or the like).

Natural type or wild type: This means a gene or a product thereof (for example, a protein) that can be obtained in a natural system. This is a concept contrary to mutant, polymorphism, and variant, of which the products have characteristics changed due to an artificial or natural change in gene sequences.

The present invention will be described according to the above definitions of the terms and abbreviations.

As described above, the TAPE method according to the present invention is a method for screening target proteins, such as an immunoglobulin variable domain derived from human germline cells (VH or VL), a ligand, or the like, having solubility and excellent thermostability, by preparing a gene construct where a target protein and an antibiotic-resistant protein are bound to a Tat signal sequence, and then transforming a host cell, particularly E. coli, with a vector including the gene construct to express the fusion protein in E. coli; the TAPE system is a system for performing the method.

The “Tat signal sequence” is a sequence recognized by the Tat (twin-arginine translocation) pathway. It leads proteins to a pathway passing from the cytoplasm to the intracellular membrane in bacteria, and to a pathway moving from the stroma to the thylakoid in chloroplasts. Generally, the Tat signal sequence is divided into three motifs: an n-region, which is a positively charged N-terminal motif; an h-region, consisting of hydrophobic amino acids, in the middle; and a c-region, which is a C-terminal motif. As a result of analyzing several Tat signal sequences, it was found that S/T-R-R-x-F-L-K, which is a distinctive conserved sequence of the Tat signal sequence, is present across the n-region and the h-region. The two arginines (R) in this sequence are named twin-arginine because they are conserved in all the Tat signal sequences.
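As a minimal sketch, the conserved sequence S/T-R-R-x-F-L-K can be written as a regular expression and used to scan a candidate signal peptide. The function name `find_tat_motif` and both example peptides below are invented for illustration and are not sequences from the disclosure; matching the strict consensus is itself an assumption, since individual signal peptides may deviate from it.

```python
import re

# S/T-R-R-x-F-L-K as a regular expression; "." stands for any residue x.
TAT_MOTIF = re.compile(r"[ST]RR.FLK")


def find_tat_motif(signal_peptide):
    """Return (start_index, matched_text) of the first motif hit, or None."""
    match = TAT_MOTIF.search(signal_peptide.upper())
    if match is None:
        return None
    return match.start(), match.group()


# Hypothetical peptides, invented for this example only.
with_motif = "MKQNDLSRRQFLKAAGLAVA"
without_motif = "MKKTAIAIAVALAGFATVAQA"

print(find_tat_motif(with_motif))
print(find_tat_motif(without_motif))
```

The start index is zero-based, so a hit near the beginning of the peptide is consistent with the motif lying across the n-region and the h-region.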

The Tat signal sequence may be selected from TorA, CueO, DmsA, FdnG, FdoG, HyaA, NapA, SufI, WcaM, YagT, YcbK, YcdB, YdhX, and YnfE, but is not limited thereto. Proteins that have a complete three-dimensional structure by binding with chaperones or various cofactors in the cytoplasm move to the cell membrane through the Tat signal pathway, but are not compatible with the Sec signal, which is the general cell membrane movement pathway of bacteria. In other words, since only proteins that are folded to have a complete three-dimensional structure in the cytoplasm may be recognized by the Tat ABC translocase complex, they are involved in a protein translocation pathway having different characteristics from the Sec pathway (Baneyx and Mujacic, Nat. Biotech. 2004, 22, 1399-1408).

The TAPE method according to the present invention uses the above characteristics of the Tat signal pathway. In order for the target protein derived from a library (for example, a human VH domain library) to pass through the TAPE screening system pathway, it needs to be completely folded within the cytoplasm and thus recognized by the Tat ABC translocase complex. Therefore, this complex present in the intracellular membrane may function as a fitness filter that passes only substrates fitted to the Tat pathway. Accordingly, the present invention uses the fact, as found by other studies, that a protein generally passes through the Tat pathway only when it is completely folded in the cytoplasm, which is dependent on solubility and fast folding of the protein (DeLisa et al., 2003 PNAS 100(10): 6115-6120; Sanders et al., 2001 Mol Microbiol 41(1): 241-246; Matos et al., 2008 EMBO J 27(15): 2055-2063; Fisher et al., 2006 Protein Sci 15(3): 449-458; Lim et al., 2009 Protein Sci 18(12): 2537-2549). The Tat pathway signal sequence usable in the present invention is preferably selected from signal sequences of TorA, CueO, DmsA, FdnG, FdoG, HyaA, NapA, SufI, WcaM, YagT, YcbK, YcdB, YdhX, and YnfE proteins, but is not limited thereto.

In other words, the cytoplasm of E. coli, where protein folding occurs due to the nature of the Tat pathway, is a reducing environment that does not allow disulfide bonds, and thus VH domains in which fast and accurate folding occurs autonomously without the help of a disulfide bond can be filtered from the VH domain antibody library. Finding antibodies that function under the reducing environment and fold autonomously is a key point in developing specific antibodies for target antigens existing in the reducing environment of the cytoplasm, that is, intrabodies. Physicochemical properties of human germline VH domains or engineered VH domains screened by the TAPE system of the present invention may be determined by analyzing characteristics of the screened VH domains. It is difficult to simply predict how autonomous folding of the target protein within the cytoplasm contributes to certain physicochemical properties (for example, solubility of protein, thermostability of protein, long-term storage stability of protein, structural stability of protein, and the like) and further to its use as an antibody therapeutic agent. The present invention has an object of introducing and developing the above TAPE method and applying it to protein technology, and more particularly, applying the TAPE method to screen an improved human immunoglobulin variable domain antibody (VH or VL) and applying the variable domain antibody having improved properties obtained therefrom as a scaffold for developing novel therapeutic antibodies.

The TAPE method according to the present invention has the following advantages as compared with methods of the prior art.

In the prior art, a domain antibody library is panned under a predetermined stress, such as temperature, using phage display technology, and binding activity to protein A is then measured (Jespers L. et al., Nat. Biotechnol. 2004 22(9): 1161-5; Barthelemy P. A. et al., J. Biol. Chem. 2008 283(6): 3639-54). In addition, a domain named m0 that is stable in domain form was found accidentally through experiments, without particular screening procedures (J. Biol. Chem. 2009 May 22; 284(21): 14203-14210).

However, according to the present invention, human immunoglobulin variable domains having high solubility and thermostability can be easily screened by employing the TAPE method using the Tat pathway of E. coli.

Also, there has been known a method of finding soluble proteins from a library including Tat signal sequences by using plate-based screening. However, this method has to follow a procedure in which individual clones surviving due to their antibiotic resistance are obtained when an expression strain diluted at a predetermined ratio is spread on a solid medium (plate) containing an antibiotic (Fisher A. C. et al., Protein Sci. 2006 Mar. 15(3): 449-58, Fisher A. C. et al., J. Mol. Biol. 2009 385(1): 299-311). Therefore, it is difficult to isolate 10⁵ or more individual clones from one plate due to limitations of the plate-based screening method. Given that an antibody library for selecting general binding activities has a size of 10⁹ to 10¹⁰, it is physically very difficult to cover an entire normal-size library by using the above method. As a result, it is substantially difficult to realize high-throughput screening. In addition, when the gene sequences of plasmids of individual strains selected from the plate by antibiotic resistance are confirmed, in most cases of using methods of the prior art (for example, ISELATE), a case where the target gene is cloned in a fragmented form to express a short form of the protein, or a case where only the reporter gene is present without the target protein (for example, TEM-1 beta-lactamase only), is often found (Fisher A. C. et al., J. Mol. Biol. 2009 385(1): 299-311). These peptide-level short proteins (consisting of 10 to 20 amino acids) are not affected by their two- or three-dimensional structure, and thus they themselves have very high solubility in most cases, causing false positives. In the Tat-based protein folding screening method of the prior art (for example, ISELATE, Fisher JMB 2009), this false positive ratio tends to increase with the increasing number of screenings using antibiotic resistance, and this acts as a substantial hindrance that makes screening of soluble protein impossible. Therefore, most of these methods are used for the purpose of, rather than screening a large-scale library, studying crystal structures by securing soluble expression through protein modification from a small-size mutant library (10⁵ to 10⁶ in size) of target proteins for which solubility is difficult to secure (Pédelacq et al., Nat Biotechnol. 2002 20(9):927-32; Yang et al., Proc. Natl. Acad. Sci. 2003 100(2): 455-60).
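The scale of the plate-based limitation can be made concrete with a short calculation. The sketch below assumes the upper bound of 10⁵ isolable clones per plate stated above and only one-fold coverage of the library; the function name `plates_needed` is hypothetical.

```python
def plates_needed(library_size, clones_per_plate=10**5):
    """Minimum number of plates for one-fold coverage (integer ceiling)."""
    return -(-library_size // clones_per_plate)


for size in (10**9, 10**10):
    print(f"library of {size:.0e} clones -> {plates_needed(size):,} plates")
```

Under these assumptions, a library of 10⁹ to 10¹⁰ clones would require on the order of 10⁴ to 10⁵ plates, which illustrates why plate-based screening is hard to apply to a normal-size antibody library.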

However, in the TAPE method according to the present invention, the entire library is inoculated into a liquid medium containing a selective antibiotic (for example, ampicillin), instead of using the existing protein solubility screening method based on a solid medium (plate). Therefore, there are no limitations on the size of the library (E. coli) applicable to screening at one time as long as the volume of medium is increased, thereby achieving high throughput screening. In addition, as described above, in the method of the prior art, it is highly likely that clones into which a self-ligated mock vector or the above-described peptide-level gene fragments are introduced are present as false positives during the procedure of cloning the library into an expression vector (for example, pET-TAPE) after the library is screened in the liquid medium containing an antibiotic. In order to solve the false positive problem due to peptide-level gene fragments, which is an inherent problem of this ligase-based cloning method, total plasmids are collected from the collected E. coli, and both terminals of the previously designed gene that expresses the fusion protein of the target protein and TEM-1 beta-lactamase are treated with restriction enzymes. Then, the selected gene of complete size is isolated by gel electrophoresis and gel elution. As a result, only the true positive VH domain genes first screened by the TAPE method can be collected in full. Therefore, the TAPE method according to the present invention has the advantage that only true positive clones containing only the gene construct of the complete target protein can be screened depending on the degree of resistance to antibiotics, without increasing false positives in spite of the increasing number of repetitive screenings using antibiotic resistance.
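The size-selection step described above can be sketched, purely as an illustration, as a filter that keeps only fragments close to the expected full-length size. The expected size of 1200 bp, the tolerance of 30 bp, and the list of recovered fragment lengths are all hypothetical values chosen for this example; in practice the selection is performed physically by gel electrophoresis and gel elution, not in software.

```python
def select_full_length(fragment_lengths_bp, expected_bp, tolerance_bp=30):
    """Keep fragment lengths within tolerance of the expected full-length size."""
    return [n for n in fragment_lengths_bp if abs(n - expected_bp) <= tolerance_bp]


# Hypothetical restriction fragments recovered from the pooled plasmids (bp).
recovered = [1185, 1200, 840, 1212, 870]

# Shorter fragments stand in for reporter-only or truncated inserts.
full_length = select_full_length(recovered, expected_bp=1200)
print(full_length)
```

Fragments far below the expected size, which would correspond to self-ligated vectors or peptide-level inserts, are discarded, so only full-length constructs are carried into the next round.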

Specifically, the TAPE system according to the present invention uses a gene construct coding a fusion protein where a Tat-signal sequence is functionally linked to an N-terminal of a target protein, particularly a heavy chain domain, and an antibiotic-resistant protein, particularly an antibiotic resistance-conferring protein such as mature TEM-1 beta-lactamase (lacking its own Sec pathway signal sequence) or the like, is functionally linked to a C-terminal thereof.

The TAPE system, and the TAPE method using the same, rely on the principle that, after a host cell, particularly E. coli, is transformed with the gene construct, only E. coli that expresses the properly folded antibiotic-resistant protein in a soluble form by means of the Tat-signal sequence can survive under culture conditions containing antibiotics.

When one host cell is transformed with only one gene construct, the target protein contained in the surviving host cell is assumed to be properly folded in a soluble form.

In addition, soluble target proteins can be isolated in a large-scale, high-throughput manner through the TAPE method according to the present invention, by using a plurality of host cell groups, particularly groups of E. coli, transformed with gene constructs coding different target proteins.

The TAPE method comprises:

(1) culturing a host cell group in a liquid medium containing antibiotics, the host cell group being transformed with a gene construct coding a fusion protein where a Tat-signal sequence is functionally linked to an N-terminal of a target protein, particularly a heavy chain variable domain, and an antibiotic-resistant protein is functionally linked to a C-terminal thereof;

(2) collecting plasmid DNA from the antibiotic-resistant E. coli;

(3) collecting a nucleic acid sequence coding the target protein from the collected plasmid DNA; and

(4) confirming and screening a sequence of the target protein from the collected nucleic acid sequence.

Particularly, the method may further comprise, after the stage (3), one stage selected from:

(3′) preparing a gene construct where the collected nucleic acid sequence is again functionally linked to a gene coding the Tat-signal sequence and an antibiotic resistance-conferring gene, and again transforming the host cell group with the prepared gene construct,

or

(3″) directly transforming the host cell group with the plasmid containing the collected nucleic acid sequence, without preparing a separate gene construct.

The stage (3″) has an advantage over the stage (3′) in that the next stage can be performed more promptly.

The stages (1) to (3′) or the stages (1) to (3″) may be repeated for two or more rounds, and this repetitive procedure can result in screening a target protein having solubility and a high level of stability.

When the stages (1) to (3′) or the stages (1) to (3″) are repeated for two or more rounds, the target protein can finally be identified by performing the stage (4) of confirming and screening the sequence of the target protein, after the stage (3′) or (3″).

The TAPE method will be specifically described.

(1) A target protein, particularly a human variable domain library, is expressed as a fusion protein in the cytoplasm of each host cell, particularly E. coli. Here, only one particular fusion protein is expressed in each host cell. In the fusion protein, the Tat-signal sequence is functionally linked to an N-terminal of the target protein, for example, a human immunoglobulin variable domain, particularly VH, and an antibiotic resistance-conferring protein, such as mature TEM-1 beta-lactamase (lacking its own Sec pathway signal sequence) or the like, is functionally linked to a C-terminal thereof.

(2) The library expressing the fusion proteins is inoculated into a liquid screening medium containing antibiotics, and selection pressure is applied. Here, the concentration of the antibiotics contained in the liquid screening medium may be 1×, 2×, 3×, 4×, 5×, 8×, or 10× at the initial round, based on 0.1 μg/ml (1×). The antibiotic used herein may be ampicillin, carbenicillin, or the like, but is not limited thereto. Any antibiotic that is appropriate for the antibiotic-resistant protein used in the stage (1) may be used without limitation. The expressed fusion protein passes through the Tat pathway depending on the characteristics of the target protein and moves to the intracellular membrane. A fusion protein that fails to translocate due to the characteristics of the target protein, that is, one that is not soluble, may form an inclusion body or may be degraded in the cytoplasm by the Tat proofreading mechanism. Only E. coli in which the fusion protein moves to the periplasm acquires resistance in the liquid screening medium containing antibiotics, by action of the antibiotic resistance-conferring protein, such as TEM-1 beta-lactamase or the like, which is functionally linked to the C-terminal of the target protein.

(3) The plasmid DNA is collected from E. coli that survives in the liquid screening medium and then treated with the previously designed restriction enzymes, and the nucleic acid encoding only the fused portion of the target protein and the antibiotic resistance-conferring protein, such as beta-lactamase, is collected from the entire fusion protein gene by electrophoresis and gel elution,

or,

(3′) The plasmid DNA is collected from E. coli that survives in the liquid screening medium, and E. coli is then directly transformed with the collected plasmid DNA, as described in Example 5, followed by the next stage.

(4) The collected nucleic acid is cloned into a mock vector so that it is functionally linked to the Tat signal sequence again.

After that, the stages (1) to (3) may be repeated so as to enrich, from the library, the proportion of genes expressing a protein having the desired properties. Here, the liquid medium for the next round may be selected to have a higher antibiotic concentration than that of the previous round.
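The enrichment effect of repeating the rounds at increasing antibiotic concentration can be illustrated with a toy calculation. The sketch below is not part of the disclosed method: the clone classes, their starting fractions, the solubility scores, and the survival rule (score raised to the power of the concentration multiple) are all hypothetical assumptions chosen only to show how the proportion of soluble clones may grow from round to round.

```python
from typing import Dict, List


def panning_round(fractions: Dict[str, float],
                  solubility: Dict[str, float],
                  fold: float) -> Dict[str, float]:
    """One liquid-panning round: weight each clone class by a hypothetical
    survival probability (solubility score ** antibiotic fold) and renormalise."""
    weighted = {name: frac * solubility[name] ** fold
                for name, frac in fractions.items()}
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}


def run_panning(fractions: Dict[str, float],
                solubility: Dict[str, float],
                folds: List[float]) -> List[Dict[str, float]]:
    """Apply successive rounds and return the population after each round."""
    history = [dict(fractions)]
    for fold in folds:
        history.append(panning_round(history[-1], solubility, fold))
    return history


# Hypothetical starting library and solubility scores (assumptions).
start = {"soluble": 0.01, "partly_soluble": 0.19, "insoluble": 0.80}
score = {"soluble": 0.9, "partly_soluble": 0.5, "insoluble": 0.1}

# Three rounds at increasing antibiotic multiples (illustrative values).
history = run_panning(start, score, folds=[1, 5, 10])
for round_no, population in enumerate(history):
    print(round_no, {k: round(v, 4) for k, v in population.items()})
```

Under these assumed values the soluble class rises in every round, which mirrors the qualitative behaviour described above; real survival curves would have to be measured.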

As the target protein in the present invention, particularly the target protein that can be screened by the TAPE method, any type of protein that has the desired functions may be used. Preferably, a protein having binding ability to a specific target (scFv, intrabody, domain antibody, Fab), a receptor protein, particularly a T-cell receptor (TCR), a receptor ligand, or the like may be used, but the target protein in the present invention is not limited thereto. More preferably, a domain antibody, for example, a VH domain antibody or a VL domain antibody, is suitable.

The target protein in the present invention may carry mutations. For mutagenesis of the target protein, amplification-based mutation methods may be used, such as a method of synthesizing an oligomer designed such that amino acids at specific sites are randomly modified and then performing overlapping polymerase chain reaction (PCR) using the oligomer, or a method of inducing random variation at random sites under PCR conditions where the error rate of the DNA polymerase is artificially increased (error-prone PCR), but the mutation methods are not limited thereto.

The fusion protein including the target protein of the present invention may include a tag consisting of a particular amino acid sequence at the C-terminal thereof, in order to facilitate separation, purification, or detection thereof. Any tag that is commonly used in the art to which the present invention pertains may be used without limitation as this tag. For example, the tag may be selected from 6×His tag, flag tag, c-myc tag, and the like, but is not limited thereto.

Any vector that is known in the art to which the present invention pertains to be capable of expression in E. coli may be used without limitation as the vector for expressing the fusion protein, and non-limiting examples of this vector may include pET22b (Novagen), pAE34 (AthenaES), pET9a (Novagen), ΔpMK, or the like (Lim H K et al., Production Characteristics of Interferon-α Using an L-arabinose Promoter System in a High-cell-density Culture. Appl. Microbiol. Biotechnol. 53(2): 201-208). As the promoter for inducing expression of the fusion protein, a lac promoter, a T7 promoter, an arabinose promoter, or the like may be used.

Only the target gene is collected from the library screened by the TAPE method and cloned into a new expression vector, so that the target protein alone is expressed without the Tat signal and TEM-1 beta-lactamase that had been positioned at its N-terminal and C-terminal, respectively, and then a purification procedure for individual hits is performed. Here, the purification procedure may be easily performed by including a tag at the C-terminal of the target protein in order to facilitate purification and analysis. As described above, any tag that is commonly used in the art to which the present invention pertains may be used without limitation. For example, the tag may be selected from 6×His tag, flag tag, c-myc tag, and the like, but is not limited thereto. Also, the purification procedure may be performed by using a protein A affinity column according to the type of variable domain, for example VH3.

A ligand having desired properties may be screened depending on the kind of library used in the TAPE method according to the present invention. Examples of this ligand may include an immunoglobulin variable domain, particularly a domain antibody, a receptor, a receptor ligand, and the like, but are not limited thereto. In particular, the ligand may be a wild type, or may be one carrying mutations introduced by inducing mutation in the library or the like, as described above.

In addition, a gene sequence, that is, a base sequence coding the ligand, may be obtained in the common manner.

It was confirmed that the ligand obtained by the TAPE method of the present invention, for example, a wild type ligand including receptors, receptor ligands, and VH and VL from germ line base sequences, or a mutated ligand thereof screened from a combinatorial library, exhibits preferable physicochemical properties. In particular, it was confirmed that the ligand was improved in solubility, long-term storage stability, self-folding ability in the reducing environment of the cytoplasm, and thermostability.

A more preferable ligand may be, for example, a human immunoglobulin variable domain obtained from a human immune cell cDNA library, or a mutant thereof. The mutant may be screened and obtained by using a library where amino acids are modified by using an NNK primer or the like at particular positions of a frame portion of a particular wild type human immunoglobulin variable domain.
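To make the NNK randomization mentioned above concrete, the following sketch enumerates the codons covered by an NNK position (N = A, C, G, or T; K = G or T) and translates them with the standard genetic code. It is offered only as an illustration; the variable names are arbitrary and the compact codon-table construction is a common convention rather than anything taken from the present disclosure.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

# NNK: any base at positions 1 and 2, G or T at position 3.
nnk_codons = ["".join(codon) for codon in product("ACGT", "ACGT", "GT")]

encoded = [CODON_TABLE[codon] for codon in nnk_codons]
amino_acids = sorted(set(encoded) - {"*"})            # distinct residues encoded
stop_codons = [c for c in nnk_codons if CODON_TABLE[c] == "*"]

print(len(nnk_codons), "codons")
print(len(amino_acids), "amino acids:", "".join(amino_acids))
print("stop codons:", stop_codons)
```

An NNK position therefore reaches all 20 amino acids with 32 codons while retaining only one of the three stop codons, which is the usual reason for preferring it over fully random NNN positions.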

When the heavy chain or light chain variable domain, that is, a VH or VL domain antibody, which is screened by the TAPE method according to the present invention, has the corresponding frame sequence of a human VH or VL domain, its solubility and thermostability are maintained regardless of the CDR sequences.

Therefore, the VH scaffold having excellent physical properties such as high solubility, thermostability, and the like, which is screened through the present invention, may be used as a scaffold of a library for obtaining a particular ligand, that is, a domain antibody targeting a desired target, that is, an antigen. Specifically, a library is constructed by inserting random CDR sequences into the screened mutant VH domain antibody while maintaining its scaffold, and then an antibody having binding ability to a desired target, that is, an antigen, may be screened from the library by using common methods such as panning or the like.

Specifically, the above antibody may be screened by removing all the VH domain antibodies that are not bound to the immobilized desired antigen, leaving only the VH domain antibodies that are bound thereto, similarly to the common phage display method or the like. The above procedure of removing the VH domain antibodies that are not bound to the immobilized desired antigen is repeated twice or more, thereby screening VH domain antibodies having higher binding ability to the target antigen.

In order to construct a CDR mutant library by using the scaffold of the VH domain antibody obtained in the present invention as described above, a corresponding variable region (for example, a CDR in the case of a human immunoglobulin variable domain) may be given various lengths, that is, the number of amino acid residues may be changed, or particular amino acid residues may be replaced with other random amino acids. Alternatively, only some particular sites of the hypervariable region within a CDR may be randomly modified.
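As a rough guide to how CDR length affects library design, the sketch below tabulates the theoretical diversity of a fully randomized CDR of a given length. It assumes, for illustration only, that every position is randomized with an NNK codon (32 codons, 20 amino acids, one stop codon); the length range of 7 to 13 residues is taken from the CDRH3 lengths listed for FIG. 14, and the function name is hypothetical.

```python
def cdr_diversity(length: int) -> dict:
    """Theoretical diversity of a CDR with `length` fully randomised NNK positions."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return {
        "length": length,
        "protein_variants": 20 ** length,        # distinct amino acid sequences
        "nnk_dna_variants": 32 ** length,        # distinct NNK DNA sequences
        "stop_free_fraction": (31 / 32) ** length,  # clones without a TAG stop
    }


# CDRH3 lengths 7 to 13, as listed for FIG. 14.
table = [cdr_diversity(n) for n in range(7, 14)]
for row in table:
    print(f"{row['length']:>2} aa  "
          f"{row['protein_variants']:.2e} protein variants  "
          f"{row['nnk_dna_variants']:.2e} DNA variants  "
          f"{row['stop_free_fraction']:.1%} stop-free")
```

Even the shortest length here exceeds 10⁹ possible protein sequences, so a library of practical size samples only part of the theoretical space when whole CDRs are randomized, which is one reason for randomizing only selected sites.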

Accordingly, the present invention provides a library including random CDR sequences in a scaffold of a VH or VL domain antibody screened by the TAPE method and a producing method thereof, and provides a method for screening a VH or VL domain antibody having binding ability to a desired target protein by using the library, and a VH or VL domain antibody screened by the method, and also provides an amino acid sequence of the screened domain antibody and a polynucleotide encoding the same.

These methods can improve physical properties of the human single variable domain antibody, particularly the VH domain antibody, thereby obtaining a single variable domain, particularly a VH domain antibody, having such excellent solubility and thermostability that cannot be found in the natural VH and VL.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a graph showing solubility of proteins screened by a TAPE method.

FIG. 2 is a schematic diagram showing a method of screening soluble proteins by the TAPE method, comprising:

(a) culturing a host cell group in a liquid medium containing antibiotics;

(b) confirming a growth curve of the host cell group according to concentration of antibiotics;

(c) collecting plasmids to check presence or absence of nucleic acids coding target proteins; and

(d) preparing a gene construct where the collected nucleic acid sequence is again functionally linked to a gene coding a Tat-signal sequence and an antibiotic resistance-conferring gene, and then again transforming the host cell group with the prepared gene construct.

FIG. 3 is a view showing amino acid sequences of human VH domains screened from a human immunoglobulin heavy chain variable domain gene library by using the TAPE method:

(a) sequences of FR1, CDRH1, FR2, and CDRH2 of the screened human VH domains, and

(b) sequences of FR3, CDRH3, and FR4 of the screened human VH domains.

FIG. 4 shows analysis results using SDS-PAGE about expression aspects in E. coli of human VH domains screened from the human immunoglobulin heavy chain variable domain gene library by using the TAPE method, wherein Sol represents a soluble fraction after cell lysis, Incl represents an insoluble fraction after cell lysis, and an arrow indicates a band at a position of the corresponding VH molecular weight:

(a) expression aspects of VH domains known to have good solubility in the prior art, and

(b) a left box showing expression aspects of VH domains randomly selected from the human immunoglobulin heavy chain variable domains derived from human germ line cells, and a right box showing expression aspects of VH domains screened by the TAPE method.

FIG. 5 is a view showing a method of preparing an engineered library of the VH domain antibody scaffolds first screened by the TAPE method.

FIG. 6 is a view showing amino acid sequences of human VH domains screened by the TAPE method using the engineered library of VH domain antibody scaffolds:

(a) sequences of FR1, CDRH1, FR2, and CDRH2 of the screened human VH domains, and

(b) sequences of FR3, CDRH3, and FR4 of the screened human VH domains.

FIG. 7 shows analysis results using SDS-PAGE about expression aspects in E. coli of human VH domains screened by the TAPE method using the engineered library of VH domain antibody scaffolds,

wherein M denotes a marker, Lane 1 represents an expression aspect of a camelid domain antibody VHH, Lane 2 a CDR synthetic human domain antibody HEL4, Lane 3 MG2×1, and Lanes 4 to 32 represent expression aspects of VH scaffolds screened from a frame-engineered library, and the frames for respective lanes are as follows: lane 5: MG2-47, lane 6: MG2-55, lane 7: MG2-57, lane 8: MG2-59, lane 9: MG4-2, lane 10: MG4-5, lane 11: MG4-6, lane 12: MG4-7, lane 13: MG4-12, lane 14: MG4-13, lane 15: MG4-17, lane 16: MG4-20, lane 17: MG4-28, lane 18: MG4-32, lane 19: MG4-33, lane 20: MG8-4, lane 21: MG8-5, lane 22: MG8-6, lane 23: MG8-8, lane 24: MG8-11, lane 25: MG8-12, lane 26: MG8-13, lane 27: MG2-71, lane 28: MG2-9I, lane 29: MG2-10I, lane 30: MG2-11I, lane 31: MG2-12I, lane 32: MG2-12L.

FIG. 8 is a graph showing circular dichroism (CD) comparison results of VH domains screened by the TAPE method.

FIG. 9 is a graph showing circular dichroism (CD) comparison results of human VH domains screened by the TAPE method using the engineered library of VH domain antibody scaffolds.

FIG. 10 is a graph showing long-term storage stability of VH domains screened by the TAPE method.

FIG. 11 is a view showing a method of preparing an engineered library having a changed CDR length.

FIG. 12 is a view showing mutation positions of a rational library for improving binding ability to antigen:

(a) sequences of FR1, CDR H1, and FR2,

(b) sequences of CDR H2 and FR3, and

(c) sequences of CDR H3 and FR4.

FIG. 13 is a view showing a method of preparing a selective CDR-engineered library.

FIG. 14 shows analysis results using SDS-PAGE about expression aspects in E. coli of VH scaffolds according to the number of amino acid residues of CDRH3 at the time of CDRH3 modification:

(a) a case in which the number of amino acid residues of CDRH3 is 7,

(b) a case in which the number of amino acid residues of CDRH3 is 8,

(c) a case in which the number of amino acid residues of CDRH3 is 9,

(d) a case in which the number of amino acid residues of CDRH3 is 10,

(e) a case in which the number of amino acid residues of CDRH3 is 11,

(f) a case in which the number of amino acid residues of CDRH3 is 12,

(g) a case in which the number of amino acid residues of CDRH3 is 13, and

(h) a case in which the number of amino acid residues of CDRH3 is not changed.

DETAILED DESCRIPTION OF EMBODIMENTS

Hereinafter, the present invention will be described in detail with reference to examples and the accompanying drawings. However, these are intended to explain the present invention in more detail, and the scope of the present invention is not limited by the following examples.

Example 1 Preparation of pET-TAPE for Constructing TAPE System

In order to construct a twin-arginine transport (Tat)-associated protein engineering (TAPE) system, pET-TAPE was prepared, using a pET9a vector, by linking the pathway signal sequence of TorA (E. coli trimethylamine N-oxide reductase), which is a Tat substrate protein, that is, ssTorA, to an N-terminal of a target protein and linking TEM-1 beta-lactamase to a C-terminal thereof.

However, as mentioned above, the signal sequence that leads the protein to the Tat pathway is not limited to ssTorA, and it is obvious to the ordinary skilled person that signal sequences of all Tat pathway proteins may be used. Also, it is obvious to the ordinary skilled person that, as the vector, any vector that can meet the objects of the present invention may be used, such as pET9a (New England Biolabs), ApMA using an arabinose induction promoter (Korean Patent Laid-open Publication No. 1996-007784), pAE34 using a lac promoter, or the like.

In the case of using pET9a as the vector, when expression of a fusion protein consisting of ssTorA, the target protein, and TEM-1 beta-lactamase is induced by IPTG under the optimized culturing condition, the fusion protein passes through the Tat movement pathway under guidance of the signal sequence. Here, only soluble and completely folded fusion protein passes through the intracellular membrane by means of the Tat machinery (Tat A, B, C), and as a result, the TEM-1 beta-lactamase linked to the target protein moves to the periplasm of E. coli, depending on the folding characteristics of the target protein. Antibiotic resistance of E. coli is determined by the presence or absence of TEM-1 beta-lactamase in the periplasm.

In the present invention, for a system for screening a human immunoglobulin heavy chain variable domain, a human immunoglobulin heavy chain variable domain library, which is the target gene, was inserted between ssTorA and TEM-1 beta-lactamase of the pET-TAPE vector.

The experimental procedure will be described in detail as follows. A fusion gene of the ssTorA gene and a representative gene of human immunoglobulin heavy chain variable domain VH family type 2 (Stefan Ewert et al., Stability improvement of antibodies for extracellular and intracellular applications: CDR grafting to stable frameworks and structure-based framework engineering. Methods 34 (2004) 184-199) was synthesized by DNA oligomer synthesis and overlapping polymerase chain reaction (Genscript USA Inc., US). A polymerase chain reaction (PCR) was performed by using the synthesized ssTat-VH2 gene as a template, with a 5′ direction primer (SEQ ID NO: 1) including an NdeI sequence and a 3′ direction primer (SEQ ID NO: 2) including NotI, 6×His, and BamHI sequences.
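The restriction sites named for these two primers can be checked directly against the sequences listed for SEQ ID NOs: 1 and 2 in Table 5. The sketch below is only a convenience check, not part of the described procedure; the recognition sequences used (NdeI CATATG, NotI GCGGCCGC, BamHI GGATCC) are the standard ones, and the helper names are hypothetical.

```python
# Standard recognition sequences (all palindromic, so one strand suffices).
SITES = {"NdeI": "CATATG", "NotI": "GCGGCCGC", "BamHI": "GGATCC"}

# Primer sequences as listed for SEQ ID NOs: 1 and 2 in Table 5.
SEQ_ID_1 = "GCCATATGAACAATAACGATCTCTTTCAGGCATCACGT"
SEQ_ID_2 = "GCGGATCCATGGTGGTGATGGTGGTGTGCGGCCGCTGAAGAGACGGTCACCAACGTGCC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def sites_in(primer: str) -> list:
    """Names of the restriction sites from SITES that occur in the primer."""
    return [name for name, site in SITES.items() if site in primer]


forward_sites = sites_in(SEQ_ID_1)
reverse_sites = sites_in(SEQ_ID_2)

# Six histidine codons (CAC/CAT) read on the coding strand, i.e. on the
# reverse complement of the 3' primer.
HIS6_CODING = "CACCACCATCACCACCAT"
has_his_tag = HIS6_CODING in reverse_complement(SEQ_ID_2)

print("SEQ ID NO: 1 ->", forward_sites)
print("SEQ ID NO: 2 ->", reverse_sites, "| 6xHis codons present:", has_his_tag)
```

The check looks only for the presence of the motifs; it does not verify reading frame or the position of the sites within the final construct.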

The PCR reaction was performed by using the two primers, 1 mM of 0.5 U i-pfu DNA polymerase (iNtRON), 2.5 mM each of the four dNTPs, and 5 μl of 10× reaction buffer, and distilled water was added to a final volume of 50 μl. The PCR was run at 95° C. for 2 minutes, followed by 30 cycles of 94° C. for 15 seconds, 56° C. for 15 seconds, and 72° C. for 30 seconds, and finally 72° C. for 5 minutes. The amplified DNA was loaded on a 1% agarose gel for electrophoresis and then isolated by using a QIAquick gel extraction kit (QIAGEN, Valencia, Calif., USA).
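The cycling program above can be written down as data, which makes the programmed hold time easy to total. The sketch below is illustrative only; the `Step` class and function name are hypothetical, and the result excludes the ramping time between temperatures, which depends on the instrument.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """A single hold in a thermal cycling program."""
    temp_c: float
    seconds: int


def total_hold_seconds(initial: Step, cycle: List[Step],
                       n_cycles: int, final: Step) -> int:
    """Sum of programmed hold times, ignoring temperature ramping."""
    return initial.seconds + n_cycles * sum(s.seconds for s in cycle) + final.seconds


# Program described above for amplifying the ssTat-VH2 gene.
initial = Step(95, 2 * 60)
cycle = [Step(94, 15), Step(56, 15), Step(72, 30)]
final = Step(72, 5 * 60)

total = total_hold_seconds(initial, cycle, n_cycles=30, final=final)
print(f"{total} s = {total / 60:.0f} min of programmed hold time")
```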

The NdeI-ssTorA-VH2-NotI-6×His-BamHI gene amplified through the PCR was inserted between the NdeI and BamHI cutting sites present in the multi-cloning site (MCS) of the pET9a vector, to prepare a pET9a-ssTorA-VH2 plasmid. The NotI-TEM-1 beta-lactamase-BamHI segment, which was isolated by running PCR using a 5′ primer (SEQ ID NO: 3) and a 3′ primer (SEQ ID NO: 4) with a TEM-1 β-lactamase (bla) gene as a template, was inserted between the NotI and BamHI cutting sites of the pET9a-ssTorA-VH2 plasmid, and the resulting plasmid was named pET-TAPE. After that, a library was constructed by removing the VH2 region from the pET-TAPE and inserting a library gene thereinto. In order to check whether or not the constructed TAPE system is dependent on solubility of the corresponding protein, representative natural type human immunoglobulin domain antibodies (DP47d, VH2, VH3) whose degree of soluble expression in E. coli was previously known, a negative control gene (VH3-Bla, no signal sequence), and a positive control gene (ssTorA-Bla, no target protein) were introduced into the pET-TAPE, and then the degree of antibiotic resistance conferred by TEM-1 beta-lactamase was measured. It was known that soluble expression in E. coli of VH family type 2 was very unfavorable, whereas soluble expression in E. coli of VH3 and DP47d was relatively favorable (Ewert et al., Stability improvement of antibodies for extracellular and intracellular applications: CDR grafting to stable frameworks and structure-based framework engineering. Methods 34 (2004) 184-199).

Specifically, control human immunoglobulin heavy chain variable domains whose protein solubility was previously known, a negative control (a construct where a representative gene of VH family type 3 is inserted into the pET-TAPE and ssTorA is removed so as to prevent TEM-1 beta-lactamase from reaching the periplasm), and a positive control (pET-TAPE itself, a construct where no VH gene is inserted and TEM-1 beta-lactamase alone is expressed linked to ssTorA) were placed in the TAPE system, and these were inoculated into a culture liquid containing an antibiotic agent. Then, the degree of antibiotic resistance according to solubility was measured by counting total viable cells. An LB medium containing 50 μg/ml of ampicillin was used, expression was induced with IPTG for 3 hours, and then total viable cells were counted.

Results showed that the known solubility of the corresponding gene is proportional to the degree of antibiotic resistance under the pET-TAPE system (see FIG. 1). In FIG. 1, an increasing count per unit cell concentration means stronger antibiotic resistance.

Example 2 Preparation of Immunoglobulin Heavy Chain Variable Domain (VH) Library Derived from Human Germline

The cDNA libraries were secured by reverse transcription of mRNAs obtained from the liver, peripheral blood mononuclear cells (PBMC), spleen, and thyroid of human.

In order to secure DNA sequences of human immunoglobulin heavy chain variable domains therefrom, mixed primers shown in SEQ ID NOs: 5 to 13 were designed to secure all human heavy chain variable domain genes usable in the human germ cell line. Each of the secured human heavy chain variable domain gene libraries was inserted between the NdeI and BamHI sites of the pET-TAPE, to complete a library having a size of about 10⁸.

Specifically, cDNA was prepared by a reverse transcription reaction from RNAs (Clontech, Madison, Wis., US) extracted from spleen, peripheral blood mononuclear cells, the liver, and thymus of human. AMV reverse transcriptase and RNase inhibitor were purchased from Promega (Madison, Wis., USA). Respective RNAs were mixed with 1 μl of dNTP mixture (0.2 mM) and 1 μl of an oligo dT primer, and nuclease-free water was added thereto to reach a total volume of 12 μl. For RNA denaturation, the mixture was incubated at 65° C., and then 4 μl of 5× strand buffer, 1 μl of RNase inhibitor, and 2 μl of 0.1 M DTT were added thereto. The reverse transcription reaction was run at 42° C. for 15 minutes, and then left at 70° C. for 15 minutes. As the primers used in PCR, several degenerate primers were used simultaneously in order to obtain VH domains of the respective family types.

Primers shown in SEQ ID NOs: 5 to 13 (see Table 5), each including a forward NcoI sequence, and primers shown in SEQ ID NOs: 14 and 15 (see Table 5), each including a reverse NotI sequence (Integrated DNA Technologies, Inc., Coralville, Iowa, US), were used. DNA was amplified by using the cDNA generated through the reverse transcription reaction as a template, each primer at 10 pmolar, 0.5 U of i-pfu DNA polymerase (iNtRON, Korea), 2.5 mM each of the four dNTPs, and 5 μl of 10× buffer. The PCR was run at 95° C. for 2 minutes, followed by 30 cycles of 94° C. for 20 seconds, 56° C. for 20 seconds, and 72° C. for 2 minutes, and finally 72° C. for 7 minutes. The reaction mixture after the PCR was separated by electrophoresis on a 1% agarose gel and then purified by using a gel extraction kit (QIAGEN, Valencia, Calif., USA). The amplified PCR product and the pET9a-TAPE plasmid were cut with NcoI and NotI restriction enzymes, and purified by a PCR purification kit (QIAGEN) and a gel extraction kit, respectively. The amplified VH gene was inserted between the NcoI and NotI cutting sites of the pET9a-TAPE, to prepare a human germ cell-derived VH library plasmid. The prepared library was concentrated by using an ethanol precipitation method.

ElectroMAX™ DH5α-E™ (Invitrogen, Carlsbad, Calif., US), an E. coli strain, was transformed with 1 μl of DNA by electroporation (BTX model ECM630, Holliston, Mass., USA). In order to verify the size of the library, the transformed E. coli was serially diluted from 10⁻⁴ to 10⁻⁸ and cultured on an LB agar medium containing kanamycin. After that, colonies were counted.
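Library size is conventionally back-calculated from such plate counts. The sketch below shows that arithmetic under stated assumptions: the function name, the plated volume, the recovery volume, and the colony count are all hypothetical, since these values are not given here.

```python
import math


def estimate_library_size(colonies: int, dilution: float,
                          plated_volume_ml: float,
                          recovery_volume_ml: float) -> float:
    """Estimate total transformants from one countable dilution plate.

    colonies           -- colonies counted on the plate
    dilution           -- dilution of the plated sample, e.g. 1e-6
    plated_volume_ml   -- volume spread on the plate (assumed)
    recovery_volume_ml -- total volume of the recovered transformation (assumed)
    """
    if colonies < 0 or dilution <= 0 or plated_volume_ml <= 0 or recovery_volume_ml <= 0:
        raise ValueError("counts must be non-negative and volumes/dilution positive")
    cfu_per_ml = colonies / (dilution * plated_volume_ml)
    return cfu_per_ml * recovery_volume_ml


# Hypothetical example: 120 colonies on the 10^-6 plate, 0.1 ml plated,
# 1 ml total recovery culture.
size = estimate_library_size(colonies=120, dilution=1e-6,
                             plated_volume_ml=0.1, recovery_volume_ml=1.0)
print(f"estimated library size: {size:.2e} transformants")
```

In practice the plate with a countable number of colonies among the 10⁻⁴ to 10⁻⁸ dilutions would be used, and counts from replicate plates averaged.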

As a result, it was confirmed that the library size of the VH1 family was 9.1×10⁶, that of VH3 was 1.56×10⁹, and that of VH5 was 6.05×10⁸. VH genes were obtained by randomly selecting 50 single colonies from among them, culturing them in an LB liquid culture containing kanamycin, and then isolating the respective plasmids using a DNA purification kit (QIAGEN, Valencia, Calif., USA). When the base sequences of the VH genes were analyzed, it was confirmed that 90% or more of the genes were maintained in a transcribable form.

TABLE 5

Sequences of primers used in the present invention

SEQ ID NO    Sequence of primer
1            GCCATATGAACAATAACGATCTCTTTCAGGCATCACGT
2            GCGGATCCATGGTGGTGATGGTGGTGTGCGGCCGCTGAAGAGACGGTCACCAACGTGCC
3            GCGCGGCCGCACACCCAGAAACGCTGGTG
4            GCGGATCCTTACCAATGCTTAATCAGTGAGGC
5            GCGCTAGCCAGGTKCAGCTGGTGCAG
6            GCGCTAGCCAGGTCCAGCTTGTGCAG
7            GCGCTAGCSAGGTCCAGCTGGTACAG
8            GCGCTAGCCARATGCAGCTGGTGCAG
9            GCGCTAGCCAGATCACCTTGAAGGAG
10           GCGCTAGCCAGGTCACCTTGARGGAG
11           GCGCTAGCGARGTGCAGCTGGTGGAG
12           GCGCTAGCCAGGTGCAGCTGGTGGAG
13           GCGCTAGCGAGGTGCAGCTGTTGGAG
14           GCGCGGCCGCTGAGGAGACGGTGAC
15           GCGCGGCCGCTGAAGAGACGGTGAC
16           GCGCGGCCGCTGAGGAGACAGTGAC
17           GCCCATGGGAAGTCCAACTGGTTGAATCTGGTGGCGGTTTAGTT
18           AGTTGAACCGCCAGAGCCGGAAATMNNTGAGACMNNTTCMNNACCTTTGCCTGGCGCMNNACGCACCCAGCCCATAGCATAAGAAGAAAAGGTAAAGCCACTTGCAGCACAGCT
19           ATTTCCGGCTCTGGCGGTTCAACTNNKTACNNKGATAGCGTTAAAGGTCGTTTCACAATCTCC
20           GCGCGGCCGCACTGCTCACAGTAACCAGGGTACCCTG

where K means G or T, S means C or G, R means A or G, M means A or C, and N means A or T or G or C.
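The degenerate base codes defined for Table 5 determine how many distinct oligonucleotides each mixed primer represents. The sketch below computes that degeneracy and, for small cases, lists the individual sequences; it uses only the five codes defined above, the function names are hypothetical, and the two primers are copied from Table 5 as examples.

```python
from itertools import product
from math import prod

# Degenerate base codes as defined for Table 5.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "K": "GT", "S": "CG", "R": "AG", "M": "AC", "N": "ATGC",
}


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    return prod(len(IUPAC[base]) for base in primer)


def expand(primer: str) -> list:
    """List every concrete sequence of a degenerate primer (small cases only)."""
    return ["".join(bases) for bases in product(*(IUPAC[b] for b in primer))]


SEQ_ID_5 = "GCGCTAGCCAGGTKCAGCTGGTGCAG"
SEQ_ID_19 = "ATTTCCGGCTCTGGCGGTTCAACTNNKTACNNKGATAGCGTTAAAGGTCGTTTCACAATCTCC"

print("SEQ ID NO: 5 degeneracy:", degeneracy(SEQ_ID_5))
print(expand(SEQ_ID_5))
print("SEQ ID NO: 19 degeneracy:", degeneracy(SEQ_ID_19))
```

SEQ ID NO: 5 contains a single K and so stands for two oligonucleotides, while the two NNK codons in SEQ ID NO: 19 give 32 × 32 = 1024 variants.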

An “antibody sequence numbering system” in the present invention means a system by which amino acids of the immunoglobulin variable domain, VH or VL, are numbered. The number of amino acids in a CDR differs from antibody to antibody. Therefore, the conserved amino acid sequences (for example, frame portions) and the variable portions of the many kinds of individual VH or VL domains need to be numbered by a fixed rule, starting from the N-terminal. The Kabat, Chothia, and IMGT numbering systems are representative, and they differ from each other in the order in which amino acids of the CDR portions are numbered. In the present invention, the Kabat numbering system is used. For example, the Kabat numbering system follows the following principles when numbering amino acids of CDR1. The basic premise is that a CDR of the antibody may be divided, according to the degree of amino acid variation, into a hypervariable region and a canonical structure that structurally supports the hypervariable region. For example, Frame 1, which is the first conserved frame, ends at the 30th amino acid. First, the 31st to 35th amino acids, which correspond to the canonical structure of the first variable region (CDR1), are numbered 31 to 35, respectively. If amino acids after the 35th amino acid are determined to be variable amino acids that are not identical to Frame 2, they are numbered 35a, 35b, 35c . . . , in order, until the hypervariable region ends. Therefore, according to the Kabat numbering system, Frame 2 always starts from amino acid no. 36. Amino acids of CDR2 are numbered in the same manner. First, the 50th to 52nd amino acids, which correspond to the canonical structure necessarily present in CDR2, are numbered, and then the next amino acids are numbered 52a, 52b, 52c . . . , in order. Then, amino acids in the canonical structure of the rear portion of CDR2 are numbered 53 to 65, in order. Therefore, Frame 3 always starts from amino acid no. 66. Amino acids of CDR3 are numbered in the same manner. First, amino acids of the canonical structure with which CDR3 starts are numbered 95 to 100, respectively, and amino acids of the following hypervariable region are numbered 100a, 100b, 100c, . . . , in order. Then, amino acids in the canonical structure of the rear portion of CDR3 are numbered 101 to 102, in order. Therefore, the final frame connected thereto, Frame 4, always starts from amino acid no. 103.
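The numbering rule described above can be sketched as a small function that produces the position labels for a CDR of a given length. This is a simplified illustration that covers only the insertion case described in the text (CDRs at least as long as their fixed positions); the function name and the `CDR_RULES` table are hypothetical, and shorter CDRs, which the full Kabat scheme handles by leaving positions unused, are rejected.

```python
import string

# (first position, position after which lettered insertions go, last position)
CDR_RULES = {
    "CDR1": (31, 35, 35),
    "CDR2": (50, 52, 65),
    "CDR3": (95, 100, 102),
}


def kabat_labels(cdr: str, length: int) -> list:
    """Kabat-style position labels for a heavy chain CDR of `length` residues."""
    first, insert_after, last = CDR_RULES[cdr]
    fixed = last - first + 1
    extra = length - fixed
    if extra < 0:
        raise ValueError(f"{cdr} shorter than {fixed} residues is not handled here")
    if extra > len(string.ascii_lowercase):
        raise ValueError("too many insertions for single-letter suffixes")
    labels = [str(n) for n in range(first, insert_after + 1)]
    labels += [f"{insert_after}{string.ascii_lowercase[i]}" for i in range(extra)]
    labels += [str(n) for n in range(insert_after + 1, last + 1)]
    return labels


print(kabat_labels("CDR1", 7))
print(kabat_labels("CDR3", 10))
```

Because the insertions take lettered labels, the first residue of the following frame keeps the same number (36, 66, or 103) whatever the CDR length.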

Example 3 Screening of VH Library Derived from Human Germline Through TAPE (Tat-Associated Protein Engineering)

(1) Construction of VH Library Derived from Human Germline

E. coli, T7 Express LysY/I^(q) was transformed with the pET-TAPE libraryby an electroporation method. Then, it was cultured in an SOC cultureliquid at 37□ for 1 hour, and then inoculated and cultured in an LBculture liquid containing 50 μg/ml of Carbenicillin (1×). When an ODvalue of E. coli was 0.6, E. coli was collected by using centrifugalseparation, and then the plasmid isolated by using a plasmid DNApurification kit (QIAGEN, Valencia, Calif., USA), and followed bycutting with restriction enzymes, NcoI and BamHI. The cut gene includesthe VH library and β-lactamase genes, and this is for excluding falsepositive that may arise in a subsequent liquid panning procedure. ThepET-TAPE plasmid also was cut with restriction enzymes, NcoI and BamHI.After cutting, respective DNAs were purified with a gel extraction kit(QIAGEN). The VH gene obtained from the pET9a-TAPE library screened inthe Carbenicillin LB culture liquid was inserted between NcoI and BamHIcutting sites of the pET-TAPE, and E. coli T7 Express LysY/I^(q) wasagain electro-transformed with this. After that, the above procedure wasrepetitively performed while the concentration of Carbenicillin in theculture liquid was increased to 250 μg/ml (5×) and 500 μg/ml (10×), forliquid panning. A schematic diagram for respective procedures is shownin FIG. 2.

Finally, after liquid panning was performed at the respective concentrations of Carbenicillin, 50 single colonies were selected from an LB agar plate containing ampicillin and then cultured in liquid medium containing ampicillin. Then, plasmids were collected therefrom, followed by base sequence analysis. The culture method for analyzing characteristics of the VH domains of the screened clones is as follows. E. coli DH5α and T7 Express LysY/I^(q) were purchased from NEB (New England BioLabs, Inc., Beverly, Mass., USA). In the case where E. coli included a pET-TAPE based plasmid, E. coli was cultured in an LB culture liquid containing 50 μg/ml of Kanamycin. In the case where E. coli included a pET22b vector, 50 μg/ml of Ampicillin or Carbenicillin was added to the culture liquid. For seed culturing, colonies separated from the LB solid medium as single colonies were inoculated in the LB liquid medium containing the above antibiotic agent, and then cultured at 200 rpm for 12 hours or longer at 37° C. The seed culture liquid was then inoculated into the culture medium at a dilution of 1:100.

(2) Screening Result of VH Library Derived from Human Germline

Amino acid sequences of the natural type human VH domain antibodies screened by using the TAPE system from the human immunoglobulin heavy chain variable domain gene libraries prepared as described above are shown in FIG. 3. For each case, the scaffold of the screened natural type human VH domain antibody, that is, the sequences of the FR1 to FR4 frames, is shown in Table 1.

When repeated, identical sequences among the total of 154 VH sequences separated from the final third liquid panning were counted once, a total of 54 unique sequences were obtained. Among the total of 154 sequences, 148 sequences, corresponding to 96% thereof, were determined to be of the VH3 family type. The VH3 family is known to be relatively highly soluble among the seven families of human immunoglobulin heavy chain domains, and thus, this shows that screening using the TAPE system of the present invention has statistically significant screening ability based on solubility.

In addition, it was found that the frame sequences of 19 sequences among the 54 individual sequences were unique, and among them, 13 frame sequences were classified as the VH3 family type.

In order to check the degree of soluble expression when the screened individual VH genes alone are expressed in E. coli without TEM-1 beta-lactamase, which is a reporter gene, the soluble fraction and the insoluble fraction were separated after induction of VH expression, and then compared with various kinds of control VH domains through SDS-PAGE (see FIG. 4).

The corresponding genes were cloned into the pET-22b(+) expression vector, and E. coli NEBT7 was transformed therewith as a host cell. For expression of the scaffolds, culture was performed at 37° C. and 200 rpm, and expression was induced with 1 mM of IPTG when the OD value was 0.6 to 0.8. After further culture at 25° C. and 180 rpm for 3.5 hours, the cells were collected.

A soluble fraction and an insoluble fraction of protein were separated by using B-PER Reagent (Thermo Scientific). After cell lysis, the soluble fraction (supernatant) was obtained by centrifugation. The precipitate (pellet) was washed with PBS, and then re-suspended in solubilization buffer (pH 7.4, 50 mM NaH2PO4, 6 M urea, 0.5 M NaCl, 4 mM DTT) to obtain the insoluble fraction. Expression was analyzed by using SDS-PAGE.

As a result, it can be seen that, in the cases of VH domains (1, 2, 3) randomly selected from the library without the screening procedure, there was little VH domain protein of the corresponding size in the soluble fraction after induction of protein expression, whereas, in the cases of VH domains screened by the TAPE procedure (MG4×4-44, MG4×4-25, MG10-10, MG2×1), soluble expression was remarkably increased (see FIG. 4(b)). It can also be seen that the VH domains screened by the TAPE system had a relatively higher ratio of soluble expression as compared with VH domains (VH2, VH3, VH6, DP47d, and HEL4) generally known to have an excellent degree of soluble expression.

Example 4 Preparation of Frame-Engineered Synthetic Library Based on MG2×1 VH Scaffold

In order to confer additional solubility and stability based on the MG2×1 VH scaffold among the optimum natural type human immunoglobulin heavy chain variable domain (VH) candidate groups screened by the TAPE method, a “frame-engineered synthetic library” where mutations were introduced at 7 particular amino acid sites, which were rationally selected based on structural analysis of VH, was constructed.

Amino acid mutation sites in the MG2×1 VH scaffold are indicated by square boxes (▪) in FIG. 6 based on the Kabat numbering system.

Specifically, mutations were introduced into the sequence of the MG2×1 VH scaffold first screened by the TAPE method. The polymerase chain reaction (PCR) for introducing the mutations is described in detail as follows.

The entire gene sequence was divided into two fragments to prepare primers for PCR. Primers that introduce mutations were prepared as the 3′ primer of the first fragment and the 5′ primer of the second fragment, and the respective gene fragments were generated through PCR (see Sequences 17 and 18 of Table 5). Then, a final MG2×1 based frame-engineered synthetic library was constructed through overlapping PCR of the two gene fragments (see Sequences 18 and 19 of Table 5 and FIG. 5). The amplified DNA was separated from the agarose gel, treated with restriction enzymes NcoI and NotI, and then inserted into the pET-TAPE vector, thereby preparing a pET-TAPE frame-engineered heavy chain variable domain synthetic library.

Example 5 Screening of MG2×1 VH Scaffold Based Engineered VH Domains Through TAPE (Tat-Associated Protein Engineering) System

New synthetic VH scaffolds having improved solubility and stability were screened by using the TAPE system of the present invention as described above. The respective stages of TAPE were performed by increasing the concentration of Carbenicillin in the order of 50 μg/ml (hereinafter, “1× TAPE”), 100 μg/ml (hereinafter, “2× TAPE”), 200 μg/ml (hereinafter, “4× TAPE”), and 400 μg/ml (hereinafter, “8× TAPE”).

When the base sequences of 20 single colonies obtained after 1× TAPE were analyzed, no pET-TAPE clone including only the TEM-1 β-lactamase gene was found. Therefore, it was confirmed that false positives can be excluded at an initial stage of screening by the TAPE system of the present invention. Then, when 2× TAPE, 4× TAPE, and 8× TAPE were performed, two methods were carried out in parallel: a method of collecting only the VH genes after NcoI and BamHI restriction enzyme reactions and then introducing them again into the TAPE system, and a method of transforming E. coli with the pET-TAPE VH plasmid library separated from the cell culture liquid without the cloning work. False positive colonies were not found in either of the two methods. After 8× liquid panning was finally performed, 50 single colonies were selected from an LB solid medium and then liquid-cultured. Then, plasmids were collected therefrom, and amino acid sequence analysis was performed.

As a result, it was found that particular amino acids were preferentially selected at position Nos. 50 and 58 based on the Kabat numbering system, among the seven modification positions. Specifically, it was found that alanine was modified to tryptophan at position 50 in 16 of 41 clones and tyrosine was modified to tryptophan at position 58 in 24 of 41 clones. No particular amino acid bias was observed at the rest of the positions.

TABLE 2 Amino acid sequences of FR1 to FR4 frames of amino acid-modified VH domain antibody scaffold

Scaffold name  FR1                            FR2             FR3                               FR4
MG8-21         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGNEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-12L        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRRAPGKGIEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-7I         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGPEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-9I         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRKAPGKGYEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-10I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGYEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-11I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGYEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-12I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGIEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-32         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGPEHVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-34         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRSAPGKGVEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-40         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGTEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-46         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRCAPGKGYEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-47         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGLEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-48         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGLEYVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-51         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGTEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-53         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGVEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-55         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRWAPGKGPEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-57         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGREWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-58         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGCELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-59         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRKAPGKGLETVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-60         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGLECVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-64         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRCAPGKGWEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-12         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGVELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-13         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGAEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-17         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGREWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-18         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGVEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-20         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGLEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-28         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGTERVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-2          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGMEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-32         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRAAPGKGPELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-33         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGYEHVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-34         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGLECVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-5          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGPETVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-6          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGSEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-7          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGTEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-11         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGAEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-12         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRWAPGKGKEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-13         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGIEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-14         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGPEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-4          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGPEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-5          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGIEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-6          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGVEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-8          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRAAPGKGLEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
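As an illustration of how the frame positions that differ between scaffolds can be read from Table 2, the following minimal sketch compares the FR2 sequences of a few rows column by column. The function name variable_positions is hypothetical, and the sketch assumes that FR2 begins at Kabat position 36 with no insertions; it is not the analysis method of the present invention.

```python
# FR2 sequences copied from a few rows of Table 2.
fr2 = {
    "MG8-21": "WVRNAPGKGNEIVS",
    "MG2-12L": "WVRRAPGKGIEVVS",
    "MG4-13": "WVRFAPGKGAEWVS",
    "MG8-4": "WVRQAPGKGPEVVS",
    "MG8-14": "WVRQAPGKGPEWVS",
}

FR2_START = 36  # assumption: Kabat heavy-chain FR2 starts at position 36

def variable_positions(seqs, start):
    """Return the positions at which equal-length sequences differ."""
    columns = zip(*seqs)
    return [start + i for i, col in enumerate(columns) if len(set(col)) > 1]

positions = variable_positions(list(fr2.values()), FR2_START)
print(positions)
```

For the rows used here, the differing columns fall on three FR2 positions, while the remaining FR2 residues are identical across the scaffolds.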

Example 6 Separation and Purification of the Screened VH Scaffold Candidates and Analysis of Physicochemical Properties Thereof

In order to determine physicochemical properties of the screened human-derived VH scaffold candidates (see FIG. 3) and the VH scaffold candidate group obtained through synthetic mutation (see FIG. 6), three analysis procedures were performed.

First, the ratio of soluble expression of the screened VH scaffolds was checked in order to determine the degree of solubility of the VH scaffold candidates. Second, circular dichroism (CD) measurement was performed in order to determine thermostability of the respective scaffold candidate groups. Third, the aggregation-free characteristics of the purified scaffold candidate proteins were confirmed by the ratio of monomers and long-term storage stability through gel filtration chromatography.

(1) Separation and Purification of the Screened VH Scaffold Candidates

The screened gene was transferred to an E. coli expression vector, and expressed alone without the reporter protein. PCR was run by using a pET-TAPE-VH candidate plasmid as a template and using the 5′ primer (SEQ ID NO: 21 of Table 6) including an NcoI restriction enzyme base sequence and the 3′ primer (SEQ ID NO: 22 of Table 6) including an XhoI restriction enzyme base sequence. DNA fragments corresponding to the VH candidates amplified by PCR were treated with NcoI and XhoI restriction enzymes, and then inserted between the NcoI and XhoI cutting sites of the pET22b(+) plasmid vector, thereby preparing a pET22b-VH plasmid. E. coli T7 Express LysY/I^(q) was transformed with the prepared plasmid. Then, single colonies were selected and inoculated in an SB culture liquid containing 100 μg/mL of ampicillin, 20 mM of MgCl₂, and 2% (w/v) of glucose. When the optical density of the culture liquid was 0.6, 1 mM of IPTG was added thereto, and culturing was performed at 25° C. for 4 hours for protein expression. E. coli was collected through centrifugal separation of the culture liquid, and then re-suspended in phosphate-buffered saline (PBS). The suspended E. coli was frozen and thawed four times to lyse the cells, and the supernatant was collected by centrifugal separation. NaCl was added to the collected supernatant to a concentration of 0.5 M, and the pH was set to 7.4 by using 5 N NaOH, followed by filtering with a 0.22 μm filter. The protein was purified by Ni-NTA affinity chromatography, with washing at 100 mM of imidazole and elution at 300 mM of imidazole. The purified protein was confirmed by electrophoresis using a NuPAGE 4-12% Bis-Tris gel purchased from Invitrogen, followed by staining with Coomassie blue dye. The eluted protein was buffer-exchanged into phosphate-buffered saline (PBS) through PD-desalting columns (GE Healthcare Life Science, Piscataway, N.J., USA).

TABLE 6 Sequences of primers used in the present invention

SEQ ID NO  Sequence of primer
21         GCCCATGGGAAGTCCAACTGGTTGAATCTGGTGGC
22         GCCTCGAGACTGCTCACAGTAACCAGGGTACCCT
23         GCGCTAGCGAAGTCCAACTGGTTGAATCTGGTGGC
24         ACTGGCGCAGTAATACACTGCGGTATC
25         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNTCTGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
26         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNTCTGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
27         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNTCTGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
28         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNTCTGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
29         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNTCTGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
30         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNTCTGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
31         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNTCTGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
32         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNGGAGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
33         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNGGAGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
34         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNGGAGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
35         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNGGAGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
36         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNGGAGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
37         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNGGAGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
38         GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNMNNGGAGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG
39         CTGACGCACCCAGCCCATAGCATANNNNNNAAANNNAAAGCCACTTGCAGCACAGCTTAAGCG
40         TATGCTATGGGCTGGGTGCGT
41         GCTATCATCGTACCAAGTTGAACCGCCNNNGCCGGAAATCAATGAGAC
42         GGCGGTTCAACTTGGTACGATGATAGC
43         TCCCTGGCCCCAGTAGTCAGGAGCNNNAGTNNNCGGNNNATGTCTGGCGCAGTAATACACTGCGGTATC
44         GCGGATCCTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCAGTAGTCAGGAGC
45         ACGCACCCAAGACATAGCATANNNNNNAAANNNAAAGCCACTTGCAGCACAGCTTAAGCG
46         TATGCTATGTCTTGGGTGCGT
47         AACGCTATCAGCGTAATAAGTTGAACCGCCNNNGCCGGAAATAGCTGAGAC
48         GGCGGTTCAACTTATTACGCTGATAGCGTT

where K means G or T, S means C or G, R means A or G, M means A or C, and N means A or T or G or C

(2) Analysis on the Degree of Soluble Expression in E. coli of the Screened VH Candidates

In order to confirm the soluble expression in E. coli of the VH domains first screened through TAPE from the frame-engineered library based on the MG2×1 VH scaffold, each corresponding VH alone was expressed in the same manner as in Example 3 (2), and then a soluble fraction and an insoluble fraction were separated, followed by SDS-PAGE analysis.

As a result, VH scaffold candidates were obtained whose soluble expression is improved as compared with the natural type MG2×1 separated from the human immunoglobulin heavy chain variable domain library. As depicted in FIG. 7, most of the selected VH domains expressed in E. coli were soluble.

Among them, the MG8-14, MG2-55, MG4-5, MG4-13, and MG8-4 scaffold candidates (arrows depicted in FIG. 7) show excellent soluble VH expression, and analysis of the VH domains confirmed that the insoluble expression ratio of the VH domains which exhibited excellent soluble expression was decreased.

(3) Analysis of Thermostability

In order to determine thermostability through analysis of the secondary structure of the VH scaffold candidate group, the Tm (melting temperature; the temperature at which 50% of the protein in aqueous solution is thermally denatured) of the VH scaffold candidates screened by the TAPE procedure and purified was determined by a circular dichroism (CD) method.

The folding fraction was represented by the ratio of the absorbance at a certain temperature to the absorbance before heating. The absorbance was measured at a wavelength of 235 nm as the temperature was changed. In the sigmoidal curve obtained therefrom, Tm is the temperature at which the folding fraction is 50%.
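The definition of Tm given above can be sketched as a short calculation. The following is only an illustration: the function names and the temperature and signal values are hypothetical, and linear interpolation between neighboring readings is an assumption of this sketch rather than the fitting method used with the spectropolarimeter.

```python
def folded_fraction(signals):
    """Normalize each reading to the reading taken before heating."""
    return [s / signals[0] for s in signals]

def estimate_tm(temps, fractions, level=0.5):
    """Interpolate linearly to the temperature where the fraction falls to level."""
    points = list(zip(temps, fractions))
    for (t0, f0), (t1, f1) in zip(points, points[1:]):
        if f0 >= level > f1:
            return t0 + (f0 - level) * (t1 - t0) / (f0 - f1)
    return None  # the curve never crosses the level

# Hypothetical readings for illustration only (not data from this document).
temps = [25, 35, 45, 55, 65, 75, 85]
signals = [10.0, 9.8, 9.0, 7.0, 3.0, 1.0, 0.5]

tm = estimate_tm(temps, folded_fraction(signals))
print(tm)
```

In practice a sigmoidal fit over all readings would usually be preferred to two-point interpolation; the sketch only makes the 50% criterion concrete.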

The scaffold candidates screened from the natural type human immunoglobulin heavy chain variable domain library and the MG2×1 VH scaffold based frame-engineered synthetic library were purified, and then diluted to a concentration of 0.2 to 0.3 mg/ml. Then, CD thereof was measured by using a spectropolarimeter (Jasco J-715 model, Jasco Inc., Easton, Md., USA). CD signals were recorded at a wavelength of 235 nm over the temperature range of 25 to 85° C. while the temperature was increased by 1° C. per minute.

Generally, protein aggregation occurs in most of the natural type human immunoglobulin heavy chain variable domains (VH) present in aqueous solution, and thus, CD measurement is impossible.

However, most of the VHs screened by the TAPE system do not aggregate when they exist alone. According to the result of CD analysis of thermostability of the VHs screened by the TAPE system, MG3-10 has a Tm value of about 45° C. (see Table 7), and MG4×4-44, MG4×4-25, MG10-10, and MG2×1 have Tm values of about 55 to 65° C. This means that thermostability thereof was improved by about 10 to 20° C. as compared with the average Tm of natural VH (see FIG. 8 and Table 7).

Tm values of VH domains screened by the TAPE system from the MG2×1 VH scaffold based frame-engineered synthetic library (for example, MG2-12I, MG2-12L, MG4-13, MG8-4, MG8-14, and the like) were measured to have an average Tm of 65 to 75° C., and thus, it was confirmed that thermostability thereof was improved by 20 to 30° C. on average (see FIG. 9 and Table 7).

TABLE 7 Tm values of domain antibodies screened by TAPE from human immunoglobulin variable domain library

VH domains derived from   Tm value   VH domains derived from MG2X1   Tm value
human germline            (° C.)     based frame-engineered          (° C.)
MG4x4-44                  55.6       MG4-5                           67.8
MG3-10                    46.5       MG4-13                          65.2
MG4x4-25                  61.8       MG8-4                           72.3
MG10-10                   55.4       MG8-14                          76.5
MG2-1                     65.2       MG2-55                          69.9
                                     MG2-12I                         66.6
                                     MG2-7I                          77.0
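For reference, the two groups listed in Table 7 can be summarized with a short calculation. This sketch only averages the tabulated values and compares the two screened groups with each other; it does not reproduce the comparison with natural VH described in the text, for which no individual values are given. The variable names are illustrative.

```python
from statistics import mean

# Tm values (deg C) copied from Table 7.
germline = {
    "MG4x4-44": 55.6, "MG3-10": 46.5, "MG4x4-25": 61.8,
    "MG10-10": 55.4, "MG2-1": 65.2,
}
engineered = {
    "MG4-5": 67.8, "MG4-13": 65.2, "MG8-4": 72.3, "MG8-14": 76.5,
    "MG2-55": 69.9, "MG2-12I": 66.6, "MG2-7I": 77.0,
}

mean_germline = mean(germline.values())
mean_engineered = mean(engineered.values())
difference = mean_engineered - mean_germline

print(round(mean_germline, 1), round(mean_engineered, 1), round(difference, 1))
```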

(4) Analysis of the Degree of Aggregation of VH Screened

Stability of the protein was investigated by measuring the aggregation of candidate VH frames during long-term storage. VH scaffold candidates purified to a concentration of 0.2 to 0.8 mg/mL were stored at 37° C. and 60% humidity.

Samples were taken at predetermined intervals during storage for about 20 days, and aggregated protein was removed therefrom by centrifugal separation. Then, the concentration of protein remaining in solution was measured, and the ratio thereof was calculated.

The remaining protein was separated by electrophoresis using a NuPAGE 4-12% Bis-Tris gel, and then the quality thereof was confirmed through staining with Coomassie blue dye. As a result, it can be seen that most of the protein dissolved in phosphate-buffered saline (PBS) was stably maintained without protein aggregation under the accelerated condition (37° C.) for 60 days or longer. Each data point represents the ratio of protein content remaining in the aqueous solution when the initial protein content is set to 1 (see FIG. 10).

Example 7 Construction of Engineered Libraries According to Lengths of CDRH3 for Conferring Antigen Binding Ability to Candidate VH Scaffolds

In order to confer diversity of antigen binding ability to the MG2×1 scaffold screened from the human germline derived VH library by the TAPE method, and the MG8-4 and MG8-14 scaffolds screened from the frame-engineered library by the TAPE method, a CDR3 synthetic library according to the number of amino acids (7 to 13 amino acids) was constructed at the CDRH3 portion of the corresponding scaffold.

This was based on existing study results showing that the CDR3 length of VH domains having the most suitable stability and binding ability to the target protein is 7 to 13 amino acids (Christopher J. Bond et al., J. Mol. Biol. 2005 348: 699-709). In addition, amino acids were coded by the NNK nucleotide triplet, so that all 20 amino acids can be coded while the ratio of stop codons is lower than with the NNN triplet (NNK, like NNS, includes only one stop codon, TAG, among its 32 codons, whereas NNN includes three among 64). Moreover, an R94S NNK library, in which the flexibility of CDR3 was enhanced by replacing arginine (the amino acid residue preceding CDR3) with serine, was constructed by the methods set forth above.
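The properties of the degenerate triplets mentioned above can be checked by enumeration. The following minimal sketch assumes the standard genetic code; the function name scheme_stats is hypothetical. It counts, for each scheme, the number of codons, the number of stop codons, and the number of distinct amino acids encoded.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered T, C, A, G at each position.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
# Ambiguity codes used by the schemes compared here.
IUPAC = {"N": "ACGT", "K": "GT", "S": "CG"}

def scheme_stats(scheme):
    """Return (number of codons, stop codons, distinct amino acids)."""
    codons = ["".join(c) for c in product(*(IUPAC[s] for s in scheme))]
    residues = [CODON_TABLE[c] for c in codons]
    stops = residues.count("*")
    return len(codons), stops, len(set(residues) - {"*"})

for scheme in ("NNN", "NNS", "NNK"):
    print(scheme, scheme_stats(scheme))
```

A CDRH3 of n randomized positions therefore has 32 to the power n possible NNK codon combinations, which is one reason the library size grows quickly with CDR3 length.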

In order to minimize error from the PCR, a template DNA fragment comprising frame 1 to frame 3 was constructed by PCR. DNA amplification was performed on the cDNAs of MG2×1, MG8-4 and MG8-14, respectively, as a template by using 10 pmol each of the 5′ primer (SEQ ID NO: 23 of Table 6) and the 3′ primer (SEQ ID NO: 24 of Table 6), 0.5 U of I-pfu DNA polymerase, four kinds of dNTP at 2.5 mM each, and 5 μL of 10× buffer. The PCR was run under the following conditions:

94° C. for 2 minutes, followed by 25 cycles of 94° C. for 20 seconds, 56° C. for 20 seconds, and 72° C. for 25 seconds, and finally 72° C. for 5 minutes.
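The cycling conditions above can be written down as a small data structure, for example to compare programs or to estimate run time. The class and field names below are hypothetical, and the computed value is only the sum of the programmed hold times; ramping time between temperatures is not included.

```python
from dataclasses import dataclass

@dataclass
class PcrProgram:
    initial_denature_s: int
    cycles: int
    denature_s: int
    anneal_s: int
    extend_s: int
    final_extend_s: int

    def hold_time_s(self):
        """Sum of programmed hold times in seconds (ramping excluded)."""
        per_cycle = self.denature_s + self.anneal_s + self.extend_s
        return self.initial_denature_s + self.cycles * per_cycle + self.final_extend_s

# 94 C 2 min; 25 x (94 C 20 s, 56 C 20 s, 72 C 25 s); 72 C 5 min
template_pcr = PcrProgram(120, 25, 20, 20, 25, 300)
print(template_pcr.hold_time_s())
```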

In order to give diversity of antigen binding ability to the template DNA, primers (SEQ ID NOs: 25 to 31) by which a combinatorial CDRH3 library with a length of 7 to 13 amino acids is introduced were prepared. A “CDR3 engineered library according to the length” was constructed by PCR using the constructed DNA fragment as a template, the 5′ primer, and the 3′ primers introducing amino acid diversity according to the length (SEQ ID NOs: 25 to 31). The PCR was run under the following conditions:

94° C. for 2 minutes, followed by 25 cycles of 94° C. for 20 seconds, 56° C. for 20 seconds, and 72° C. for 40 seconds, and finally 72° C. for 5 minutes. FIG. 11 shows a schematic diagram of the construction of the CDRH3 engineered library according to the length for conferring diversity of antigen binding ability.
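The relation between the 3′ primers of Table 6 and the NNK coding scheme can be illustrated as follows. Because the 3′ primer anneals to the coding strand, each randomized codon appears in it as MNN, the reverse complement of NNK. In this minimal sketch the two flanking segments are copied from SEQ ID NO: 25; the function names are hypothetical.

```python
# Constant segments flanking the randomized codons in SEQ ID NO: 25 (Table 6).
FLANK_5 = "GCAGATCTTGAGGAGACAGTGACCAGGGTTCCCTGGCCCCA"
FLANK_3 = "TCTGGCGCAGTAATACACTGCGGTATCTTCAGCACGCAG"

# Complements, including the ambiguity codes M (A or C) and K (G or T).
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G",
              "M": "K", "K": "M", "N": "N"}

def cdrh3_reverse_primer(n_codons):
    """Build a 3' primer carrying n_codons randomized MNN triplets."""
    return FLANK_5 + "MNN" * n_codons + FLANK_3

def reverse_complement(seq):
    return "".join(COMPLEMENT[base] for base in reversed(seq))

primer = cdrh3_reverse_primer(7)
coding = reverse_complement(primer)
print(primer)
print(coding)
```

Read on the coding strand, the randomized stretch is a run of NNK triplets, so each added MNN lengthens the encoded CDRH3 by one amino acid.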

Example 8 Construction of Rational CDR Engineered Libraries for Conferring Diversity of Antigen Binding Ability to Candidate VH Scaffolds

For the same purpose as Example 7, a CDR engineered library for conferring diversity of antigen binding ability to the MG2×1, MG8-4, or MG8-14 based scaffold VH was constructed. Unlike Example 7, where diversity was conferred according to the length of CDRH3, the lengths of the CDRs were maintained, only the positions expected to contribute to antigen binding when mutated were rationally selected, and random mutations were introduced at the corresponding positions. That is, while the lengths of the three CDRs were maintained, mutations were selectively introduced, through structure analysis, at only the positions to which antigens are likely to bind (SDRs: specificity determining residues). This has an advantage in that changes in stability and immunogenicity problems due to CDR mutation can be minimized by introducing mutations at only the minimum positions of the CDRs.

The SDRs were selected by referring to modeling data obtained via the SWISS-MODEL system and the canonical structures thereof, or to modeling data and anticipated binding ability according to nucleotide characteristics.

In addition, amino acids were designed by introducing biased nucleotides having a relatively increased ratio of tyrosine and serine, which are known to have higher antigen binding ability than other amino acids, so that the probability of CDR binding is high even with the same library size (Akiko Koide et al., PNAS 2007 104(16):6632-6637). In order to introduce mutations at the respective CDR1, CDR2, and CDR3, the gene was divided into 3 portions and mutations were introduced thereat (see FIG. 12). The respective fragments were obtained by PCR with the frame MG2×1, MG8-4, or MG8-14 cDNA used as a template, as follows.

The first, second, and third DNA fragments of the MG8-4 and MG8-14 based libraries were synthesized through PCR by using a 5′ primer (SEQ ID NO: 23) and a 3′ primer (SEQ ID NO: 39), a 5′ primer (SEQ ID NO: 40) and a 3′ primer (SEQ ID NO: 41), and a 5′ primer (SEQ ID NO: 42) and a 3′ primer (SEQ ID NO: 43), respectively.

The first, second, and third DNA fragments of the MG2×1 based library were synthesized through PCR by using a 5′ primer (SEQ ID NO: 23) and a 3′ primer (SEQ ID NO: 45), a 5′ primer (SEQ ID NO: 46) and a 3′ primer (SEQ ID NO: 47), and a 5′ primer (SEQ ID NO: 48) and a 3′ primer (SEQ ID NO: 43), respectively.

The DNA fragments set forth above were synthesized by PCR with 10 pmol of each primer, 0.5 U of Vent DNA polymerase, four kinds of dNTP at 10 mM each, and 5 μL of 10× buffer.

The PCR was run at 94° C. for 2 minutes, followed by 25 cycles of 94° C. for 15 seconds, 56° C. for 20 seconds, and 72° C. for 25 seconds, and finally 72° C. for 5 minutes.

The full-length rational CDR engineered library was completed by overlapping PCR using the thus obtained 3 fragments as templates, the 5′ primer (SEQ ID NO: 23), and the 3′ primer (SEQ ID NO: 44). The PCR was run at 94° C. for 2 minutes, followed by 25 cycles of 94° C. for 20 seconds, 56° C. for 20 seconds, and 72° C. for 40 seconds, and finally 72° C. for 5 minutes. FIG. 13 shows a schematic diagram of the construction of the rational CDR engineered library for conferring diversity of antigen binding ability.

Example 9 Study on Effects of CDR Modification in VH Scaffold Based Libraries (Engineered Library According to CDR Length and Rational CDR Engineered Library) on Protein Stability

In order to confirm the effects of CDR modification on the stability of the VH scaffold structure, genes were randomly selected from the synthetic library of CDRH3 having 7 to 13 amino acids and from the library where mutations were rationally introduced at particular positions of CDRH1, CDRH2, and CDRH3 (about 8 genes were selected from each CDR engineered library), and they were cloned into an expression vector for expression alone. A pET-22b(+) expression vector was used as the expression vector, and E. coli DH5α, as a host cell, was transformed. The transformed colonies were randomly selected, and sequences thereof were analyzed. Plasmids in which the entire gene was maintained without a stop codon were isolated, and E. coli NEBT7 as a host cell was transformed therewith, thereby inducing expression of the corresponding genes. Culturing was performed at 37° C. and 200 rpm, and expression was induced with 1 mM of IPTG when the OD value was 0.6 to 0.8. After further culture at 25° C. and 180 rpm for 3.5 hours, the cells were harvested. A soluble protein fraction and an insoluble protein fraction were separated in the same manner as in Example 3.

The VH randomly selected from each CDR engineered library was expressed alone, and then a soluble fraction and an insoluble fraction were separated and analyzed by SDS-PAGE. As a result, it was confirmed that, in the case of CDRH3 having seven amino acids, all samples except one were expressed in a soluble form (see FIG. 14(a)).

It was confirmed that, in the case of CDRH3 having eight amino acids, seven of eight samples were expressed in a soluble form (see FIG. 14(b)). In the case of CDRH3 having nine amino acids, all samples were expressed in a soluble form (see FIG. 14(c)). In the case of CDRH3 having ten amino acids, eight of nine samples were expressed in a soluble form (see FIG. 14(d)). In the case of CDRH3 having eleven amino acids, all nine samples were expressed in a soluble form (see FIG. 14(e)). In the case of CDRH3 having twelve amino acids, six of eight samples were expressed in a soluble form (see FIG. 14(f)). In the case of CDRH3 having thirteen amino acids, six of seven samples were expressed in a soluble form (see FIG. 14(g)). In the case of the rational CDR engineered library, all samples were expressed in a soluble form. Overall, it was confirmed that soluble expression was stably induced regardless of CDR modification in the CDR engineered libraries prepared based on the frames of the VH domain antibodies screened by the TAPE method (see FIG. 14 and Table 8).

TABLE 8 Frequency of soluble expression of VH after introduction of CDR modification

Library type                      CDR3 length (a/a)   Frequency of soluble expression (% out of clones tested)
Engineered libraries according    7                   88 (7/8)
to CDRH3 length                   8                   87 (7/8)
                                  9                   100 (8/8)
                                  10                  89 (8/9)
                                  11                  100 (9/9)
                                  12                  75 (6/8)
                                  13                  86 (6/7)
                                  Overall             89 (51/57)
Rational CDR engineered library   11                  100 (11/11)
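The overall frequency in Table 8 follows from pooling the clone counts of the individual CDRH3 lengths, as the following minimal sketch shows. The counts are copied from Table 8; the variable names are illustrative.

```python
# (soluble clones, clones tested) for each CDRH3 length, from Table 8.
results = {
    7: (7, 8), 8: (7, 8), 9: (8, 8), 10: (8, 9),
    11: (9, 9), 12: (6, 8), 13: (6, 7),
}

soluble = sum(s for s, _ in results.values())
tested = sum(t for _, t in results.values())
overall_percent = 100 * soluble / tested

print(soluble, tested, round(overall_percent))
```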

Example 10 Screening of VH Domain Antibody Candidates Based on DisplayTechnology Using VH Domain Antibody Libraries Having VH ScaffoldsScreened by TAPE Method

In the present invention, phage display, yeast display, ribosomedisplay, or the like, is preferable as display technology usable inorder to screen VH domain antibody candidates by using VH domainantibody libraries having VH scaffolds screened by the TAPE method, butis not limited thereto.

According to the phage display technology, foreign peptides and proteinsare inserted and fused to capsid protein of bacteriophage so that theforeign proteins are expressed on the phage surface (Smith et al.,Science 1985 228(4705): 1315-1317). In the present invention, a domainantibody having strong binding ability was screened by binding thetarget antigen to the fusion protein expressed on the surface of thefixed phage.

To screen VH having binding ability to a specific antigen, a panningscheme set forth below was used (Carlos F. Barbas III et al. PhageDisplay—A Laboratory Manual, Cold Spring Harbor Laboratory Press);

-   cloning the VH library into the phagemid vector (pComb3X) (Examples 7 and 8);
-   expressing the VH domains at the end of phage;
-   contacting the expressed VH domains with a target protein; and
-   selecting VH domains which have good binding ability to the target protein.

After the target protein was contacted with the VH domains expressed at the end of phage, unbound phages were washed out and only the VH domain-target protein complexes were eluted. Consequently, only phages expressing VH domains bound to the target protein were concentrated.

By repeating 5 to 6 rounds of the panning process set forth above, VH domain antibodies which strongly bound to a target antigen could be screened.

In addition, in the present invention, the yeast display method was also used. According to the yeast display method, after the VH library (Examples 7 and 8) was cloned into the yeast surface expression vector, VH domains were expressed on the surface of yeast and bound to the target protein, and only domains having good binding ability were screened and eluted (Ginger Chao et al., Nature Protocol 2006 1(2):755-768). A tag was attached to, or biotin was labeled on, the target protein, and this was reacted with the displayed VH domains. Then, the bound target protein and the displayed VH domain were each labeled with a fluorescent protein targeting it. Only the labeled yeast-target protein complexes falling in a desired region were eluted by using fluorescence activated cell sorting (FACS), thereby collecting only yeast cells displaying VH domains specific to the target protein.

In the present invention, the ribosome display method was used in order to screen VH domains having binding ability to particular antigens. According to the ribosome display method, the stop codon was removed from the mRNA encoding the screened VH scaffold, and protein synthesis was then performed in vitro. A ribosome complex in which the protein and its corresponding mRNA were linked was thereby formed (Mingyue H. et al., Nature Protocol 2007 4(3):281-288). Panning was performed on the target antigen to screen complexes having desired antibodies, and, as a result, the desired complexes could be enriched. The screened mRNA was reverse-transcribed to DNA, and this procedure was repeated three or four times until the desired result was obtained. When screening by the above method is performed on the antigen binding libraries prepared in Examples 7 and 8, VH domain antibodies that bind strongly to target antigens and have stable properties such as high solubility, thermostability, and long storage stability can be screened.

According to the three screening techniques set forth above, VH domains with high affinity to human serum albumin (HSA) and Human Epidermal Growth Factor Receptor-3 (HER3) were screened by using HSA or HER3 as target antigens.

The screened VH domains maintained the property of the VH scaffold and showed high productivity in soluble form in E. coli.

Amino acid sequences and affinities of the VH domains having high affinity to HSA or HER3, out of the screened VH domains, are shown in Table 9.

TABLE 9
Amino acid sequences and affinity of VH domains screened from the libraries according to lengths of CDRH3 by using HSA and HER3 as target antigens

Target antigen   Clone      Amino acid sequence                                                         Affinity (nM)
HSA              HSA9       EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMGWVRQAPGKG                                1.7
                            PEVVSLISGSGGSTWYDDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCASHQWSRQQWGQGTLVTVSS
                 HSA11      EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMGWVRQAPGKG                                1.2
                            PEVVSLISGSGGSTWYDDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCASHKFRNLKWGQGTLVTVSS
                 HSA50      EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMGWVRQAPGKG                                7.8
                            PEVVSLISGSGGSTWYDDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCASHQFTTTQWGQGTLVTVSS
HER3             HER3#62    EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMGWVRQAPGKG                                1.8
                            PEVVSLISGSGGSTWYDDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCASHPPRVDTWGQGTLVTVSS
                 HER3#723   EVQLVESGGGLVQPGGSLRLSCAASGFTFYNYPMGWVRQAPGKG                                27.4
                            PEVVSLISGSGGSTWYDDSVKGRFTISRDNSKNTLYLQMNSLRAENTAVYYCASHPVSLLFWGQGTLVTVSS
                 HER3#748   EVQLVESGGGLVQPGGSLRLSCAASGFTFYSLMMGWVRQAPGKG                                20.0
                            PEVVSLISGSGGSTWYDDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCASRHPPGLMWGQGTLVTVSS

(Each sequence is continuous; it is wrapped over two lines for layout only.)
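To see where two clones in Table 9 differ, their sequences can be compared position by position. The Python sketch below does this for HSA9 and HSA11; the helper name `diff_positions` is a hypothetical illustration, the two sequences are copied from Table 9, and the reported positions are plain sequential indices rather than Kabat numbers.

```python
# Sequences copied from Table 9; each is written in two pieces only to
# mirror the wrapped layout of the table.
HSA9 = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMGWVRQAPGKG"
    "PEVVSLISGSGGSTWYDDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCASHQWSRQQWGQGTLVTVSS"
)
HSA11 = (
    "EVQLVESGGGLVQPGGSLRLSCAASGFTFSSYAMGWVRQAPGKG"
    "PEVVSLISGSGGSTWYDDSVKGRFTISRDNSKNTLYLQMNSLRAEDTAVYYCASHKFRNLKWGQGTLVTVSS"
)


def diff_positions(a: str, b: str) -> list[tuple[int, str, str]]:
    """Return (1-based sequential position, residue in a, residue in b)
    for every position at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length for this comparison")
    return [(i + 1, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


diffs = diff_positions(HSA9, HSA11)
for pos, res_a, res_b in diffs:
    print(f"position {pos}: HSA9 {res_a} -> HSA11 {res_b}")
```

In this comparison the two clones differ only in a short run of residues between the "CAS" and "WGQG" motifs, while the rest of the sequence, including the scaffold frame, is identical.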

1. A gene construct coding a fusion protein comprising a Tat signal sequence is linked to an N-terminal of a target protein and an antibiotic resistance-conferring protein is linked to a C-terminal of the target protein.
2. The gene construct of claim 1, wherein the target protein is an immunoglobulin variable domain, an antibody fragment, a receptor, or a receptor ligand.
3. The gene construct of claim 2, wherein the target protein is a heavy chain variable domain of immunoglobulin.
4. The gene construct of claim 1, wherein the fusion protein further includes amino acid tag for separation, purification, or detection linked thereto.
5. The gene construct of claim 4, wherein the amino acid tag is selected from the group consisting of 6×His tag, flag tag, and c-myc tag.
6. The gene construct of claim 1, wherein the Tat signal sequence is selected from the group consisting of TorA, CuoO, DmsA, FdnG, FdoG, HyaA, NapA, Sufl, WcaM, TagT, YcbK, YcdB, YdhX, and YnfE.
7. The gene construct of claim 1, wherein the antibiotic resistance-conferring protein is TEM-1 beta-lactamase, and the Tat signal sequence is TorA.
8. A vector comprising the gene construct of claim 1.
9. The vector of claim 8, wherein the vector is any one vector selected from the group consisting of pET22b, pAE34, pET9a, and ΔpMK, and the gene construct is inserted into the selected vector.
10. The vector of claim 8, wherein the vector is pET-TAPE.
11. A host cell transformed with the vector of claim 8.
12. The vector of claim 11, wherein the host cell is E. coli.
13. A host cell group comprising one or more host cells transformed with the gene construct of claim 1.
14. The host cell group of claim 13, wherein the host cells comprising the host cell group are transformed with gene constructs respectively coding fusion proteins comprising different target proteins, respectively.
15. The host cell group of claim 13, wherein the host cell is a gram-negative bacteria strain.
16. A method of screening a soluble target protein, the method comprising: (1) culturing the host cell group of claim 13 in a liquid medium containing at least one antibiotic; (2) collecting surviving host cells to collect plasmid DNA; (3) collecting a nucleic acid sequence coding the target protein from the collected plasmid DNA; and (4) confirming and screening a sequence of the target protein from the collected nucleic acid sequence.
17. The method of claim 16, further comprising, after the stage (3), one stage selected from: (3′) preparing a gene construct wherein the collected nucleic acid sequence is again functionally linked to a gene coding the Tat-signal sequence and an antibiotic resistance-conferring gene, and then again transforming the host cell group with the prepared gene construct, and, (3″) directly transforming the host cell group with the plasmid comprising the collected nucleic acid sequence, without preparing a separate gene construct, wherein the stages (1) to (3′) or the stages (1) to (3″) are repeated two or more rounds.
18. The method of claim 16, wherein the antibiotics are ampicillin or carbenicillin.
19. The method of claim 17, wherein the antibiotics are added at a gradually increasing concentration as the round is repeated.
20. The method of claim 16, wherein the antibiotics have a concentration of 0.05 to 0.2 μg/ml at the initial round.
21. A soluble VH domain antibody scaffold comprising amino acid sequences of FR1 to FR4 below:
(1) FR1: X₀VQLX₁X₂X₃GX₄X₅X₆X₇X₈PGX₉SX₁₀X₁₁X₁₂X₁₃CX₁₄X₁₅X₁₆GX₁₇X₁₈X₁₉
in the amino acid sequence of FR1, X₀ is E or Q, X₁ is V or L, X₂ is E or Q, X₃ is S or A, X₄ is G or A, X₅ is G, M, N, V, or E, X₆ is L, V, or W, X₇ is V, K, A, or I, X₈ is Q, K, or H, X₉ is G, T, A, R, E, S, or T, X₁₀ is L, V, R, or M, X₁₁ is R or K, X₁₂ is L, I, or V, X₁₃ is S, A, or T, X₁₄ is A, E, V, R, I, K, T, or S, X₁₅ is A, G, P, V, or T, X₁₆ is S, F, or Y, X₁₇ is F, Y, R, G, or L, X₁₈ is T, A, S, N, T, P, I, N, H, or A, X₁₉ is F, L, V, or C;
(2) FR2: WX₂₀RX₂₁X₂₂PGX₂₃GX₂₄X₂₅X₂₆X₂₇X₂₈
in the amino acid sequence of FR2, X₂₀ is V, A, or L, X₂₁ is Q, N, R, I, K, Y, V, M, S, Q, W, F, L, V, or C, X₂₂ is A, G, K, S, V, M, or T, X₂₃ is K, Q, E, R, or T, X₂₄ is L, N, I, P, Y, T, V, W, A, R, M, or S, X₂₅ is V or E, X₂₆ is W, I, V, P, F, H, M, Y, L, C, or R, X₂₇ is V, M, I, or L, X₂₈ is S, A, or G;
(3) FR3: X₂₉X₃₀X₃₁X₃₂X₃₃X₃₄X₃₅X₃₆X₃₇X₃₈X₃₉X₄₀X₄₁X₄₂X₄₃X₄₄X₄₅X₄₆X₄₇X₄₈X₄₉X₅₀X₅₁DX₅₂X₅₃X₅₄YX₅₅CX₅₆X₅₇
in the amino acid sequence of FR3, X₂₉ is R, H, Q, or T, X₃₀ is F, V, L, or I, X₃₁ is T, S, or I, X₃₂ is I, L, V, M, or R, X₃₃ is S, T, or D, X₃₄ is R, A, V, N, or I, X₃₅ is D, N, or A, X₃₆ is N, T, D, I, R, K, Y, or E, X₃₇ is A, S, V, or T, X₃₈ is K, R, T, Q, V, E, M, N, or I, X₃₉ is N, R, T, K, S, D, or V, X₄₀ is T, M, S, V, I, Y, or A, X₄₁ is L, V, A, or M, X₄₂ is F, Y, N, D, H, or S, X₄₃ is L or M, X₄₄ is Q, E, H, or N, X₄₅ is M, L, V, I, or W, X₄₆ is N, T, K, D, Y, I, or S, X₄₇ is S or N, X₄₈ is L or V, X₄₉ is R, K, or T, X₅₀ is D, A, S, P, T, V, I, or S, X₅₁ is E, A, D, or S, X₅₂ is T, N, or S, X₅₃ is S, A, or G, X₅₄ is V, I, L, or M, X₅₅ is Y or F, X₅₆ is A, G, V, or S, X₅₇ is R, S, K, T, L, N, or F; and
(4) FR4: X₅₈GX₅₉GX₆₀X₆₁VTVSS
in the amino acid sequence of FR4, X₅₈ is W, C, Y, G, S, or A, X₅₉ is Q, R, or L, X₆₀ is A, T, I, or V, X₆₁ is L, M, P, V, or T.
22. The VH domain antibody scaffold of claim 21, wherein the FR1 to FR4 comprise amino acid sequences below:
(1) FR1: X₀VQLX₁X₂SGGX₅X₆X₇X₈PGX₉SX₁₀RX₁₂SCX₁₄X₁₅SGX₁₇X₁₈X₁₉
in the amino acid sequence of FR1, X₀ is E or Q, X₁ is V or L, X₂ is E or Q, X₅ is G, N, V, or E, X₆ is L or V, X₇ is V or K, X₈ is Q, K or H, X₉ is G, T, A, R, E, or T, X₁₀ is L or V, X₁₂ is L or V, X₁₄ is A, E, V, I, K, or S, X₁₅ is A, G, or V, X₁₇ is F, Y, R, G, or L, X₁₈ is T, A, S, N, T, P, I, N, H, or A, and X₁₉ is F, L, V, or C;
(2) FR2: WVRX₂₁X₂₂PGX₂₃GX₂₄X₂₅X₂₆X₂₇X₂₈
in the amino acid sequence of FR2, X₂₁ is Q, N, R, I, K, Y, V, M, S, Q, W, F, L, V, or C, X₂₂ is A, G, K, S, or M, X₂₃ is K, Q, E, R, or T, X₂₄ is L, N, I, P, Y, T, V, W, A, R, M, or S, X₂₅ is V or E, X₂₆ is W, I, V, P, F, H, M, Y, L, C, or R, X₂₇ is V, M, I, or L, and X₂₈ is S, A, or G;
(3) FR3: RX₃₀TX₃₂SX₃₄DX₃₆X₃₇X₃₈X₃₉X₄₀X₄₁X₄₂X₄₃X₄₄X₄₅X₄₆X₄₇X₄₈X₄₉X₅₀X₅₁DTAX₅₄VX₅₅CX₅₆X₅₇
in the amino acid sequence of FR3, X₃₀ is F, V, L, or I, X₃₂ is I, L, V, or M, X₃₄ is R, A, V, or I, X₃₆ is N, T, D, I, R, K, Y, or E, X₃₇ is A, S, V, or T, X₃₈ is K, R, T, Q, V, E, M, N, or I, X₃₉ is N, R, T, K, S, D, or V, X₄₀ is T, M, S, V, I, Y, or A, X₄₁ is L, V, A, or M, X₄₂ is F, Y, N, D, H, or S, X₄₃ is L or M, X₄₄ is Q, E, H, or N, X₄₅ is M, L, V, I, or W, X₄₆ is N, T, K, D, Y, I, or S, X₄₇ is S or N, X₄₈ is L or V, X₄₉ is R, K, or T, X₅₀ is D, A, S, P, T, V, I, or S, X₅₁ is E, A, D, or S, X₅₄ is V, I, L, or M, X₅₅ is Y or F, X₅₆ is A, G, V, or S, and X₅₇ is R, S, K, T, L, N, or F; and
(4) FR4: X₅₈GQGX₆₀X₆₁VTVSS
in the amino acid sequence of FR4, X₅₈ is W, C, Y, G, S, or A, X₆₀ is A, T, I, or V, and X₆₁ is L, M, V, or T.
23. The VH domain antibody scaffold of claim 21, wherein the VH domain antibody scaffold is selected from scaffolds having amino acid sequences of FR1 to FR4 below.

Scaffold Name  FR1                            FR2             FR3                               FR4
MG1X8          QVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLVWVS  RFTISRDNAKNTLFLQMNSLRDEDTSVYYCAR  WGQGALVTVSS
MG2X1          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2X1-34       QVQLVESGGNVVQPGTSLRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNSRNTVFLQMTSLRAEDTAVYYCGR  WGQGILVTVSS
MG2X2-12       QVQLVQSGAEVKKPGASVKISCEASGYAF  WVRQAPGQGLEWMG  RVTLTRDTSTRTVYMELKNLRSADTGVYYCAR  WGQGTLVTVSS
MG2X2-13       EVQLLESGGGVVQPGKSLRLSCVGSGFSF  WVRQAPGKGLEWLA  RFTISRDNSKTMVNLQMNSLRPDDTAVYFCAR  WGQGTLVTVSS
MG3X1          QVQLVESGGGVVQPGRSLRLSCVASGFNF  WLRQAPGKGLEWVA  RFTISRDNSKNTLYLEMNSLRPEDTAVYYCAK  CGQGTLVTVSS
MG3X10         EVQLVESGGGLVKPGGSLRVSCAASGFTF  WVRQAPGKGLEWVG  RFTISRDDSKNMVYLQMNSLKTEDTAVYYCTT  YGQGTLVTVSS
MG4X1-8        EVQLVESGGGLVQPGGSLRLSCAASGFSF  WVRQGPGEGLVWLS  RFTISRDNAKNTVYLEMNSVRVDDTAVYYCVS  WGQGALVTVSS
MG4X1-33       QVQLVESGGGLVQPGGSLRLSCEASGFPF  WVRQAPGKGLEWVS  RFTISRDDSTNTLYLQVNSLRAEDTAVYYCAK  WGRGTLVTVSS
MG4X1-35       EVQLLESGGGLVKPGGSLRLSCVGSERSF  WVRQAPGKGLEWVA  RFTVSRDNVQKSLDLQMDSLRAEDTAVYFCAR  WGQGTTVTVSS
MG4X3-27       EVQLLESGGGLAQSGGSLRLSCAASGFTF  WVRQAPGKGLEWIS  RFTISRDIAKNSLYLQMNSLRDEDTAVYYCAK  WGQGALVTVSS
MG4X4-2        EVQLVQSGAEVKKPGESLRISCRGSGYRF  WARDKPGKGLEWIG  HVTISSDRSVSVAYLQWDSLKASDNGIYYCAL  WGQGTLVTVSS
MG4X4-4        EVQLVESGGGLVQPGGSLRLSCVPSGFTF  WVRQAPGKGLVWVS  RFTISRDNAEDTLFLQMNSLRVDDTAVYYCVR  WGQGVLVTVSS
MG4X4-25       QVQLVESGGGLVQPGGSLRLSCIASGFSL  WVRRSPGKGLEWVA  RFTVSRDNAKNSLFLQMNNVRPEDTALYFCAR  WGQGTMVTVSS
MG4X4-44       EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNAKNSLYLQMNSLRAEDTALYYCAR  WGQGTLVTVSS
MG4X5-30       EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWLS  RFTISRNNAKNSLYLQMNSLRVDDTAVYYCAR  WGQGTLVTVSS
MG4X6-27       EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQGPGKGLEWVA  RFTISRDNAENSLYLQVNSLRAEDTAIYYCAK  WGQGALVTVSS
MG4X6-48       EVQLLESGGGVVQPGRSLRLSCEVFGFTL  WVRQAPGRRLEWVA  RFTISRDIATNRLYLQMRSLRAEDTALYYCAR  WGQGTLVTVSS
MG4X7-15       EVQLLESGGGLVQPGGSLRLSCAASGFSF  WVRQAPGKGLEWVS  RFTISRDNSKNTLYLQMNSLRVEDTAVYYCAV  WGQGTTVTVSS
MG4X8-24       EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSNNTLYLQMNSLRADDTAVYFCAK  WGQGTLVTVSS
MG0.5X-1       QVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQVPGKGLEWVA  RFTISRDNAKNSLYLQMNSLRAEDTAVYYCAN  WGQGTLVTVSS
MG0.5X-3       QVQLVESGGGLVQPGGSLTLSCAASGFTF  WVRQAPGTGLLWLS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAR  WGXGTMVTVSX
MG0.5X-4       EVQLLESGGMLVKPGESLRLSCVGSGLIF  WVRHAPGKGLEWVG  RLSISRDDSMNTVYLDIYNLKIDDTGVYYCTF  WGQGTPVTVSS
MG0.5X-14      EVQLLESGGGLVHAGGSVRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNSKNSMYLQMNSLRVEDTAVYYCAR  WGQGTVVTVSS
MG0.75X-4      QVQLVESGGGLVKPGGSLRLSCAASGFTF  WLRQAPGKGPEYVA  RFIISRDDSNDMLYLEMISLKSEDTAVYYCSD  GSQGTLVTVSS
MG2X-5         EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSKNTLYLHMNSLRAEDTAVYYCVK  WGQGTLVTVSS
MG2X-15        QVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAK  WGQGTLVTVSS
MG4X-5         QVQLVESGGGLVQPGGSLRLSCEASGLHF  WVRQAPGKGLEWVA  RFTVSRDNSRNTLYLQMKSLSAEDTAVYYCAK  WGQGTMVTVSS
MG1-4          QVQLVEAGGGLVQPGGSLRLACAASGFTF  WVRQAPGKGLEWIS  RFTISRDNSQNSLFLQMNSLRAEDTAVYYCAT  WGQGTMVTVSS
MG1-6          EVQLVQSGAEVKKPGESLRKSCKGSGYSF  WVRQMPGKGLEWMG  HVTISVDKSISTAYLQWSSLKASDSAMYYFL   WGQGTLVTVSS
MG1-7          QVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNAKNSLYLQMNSLRDEDTAVYYCAR  WGQGTLVTVSS
MG1-8          EVQLVQSGAEVKKPGASVKVSCKASGYTF  WVRQAPGQGLEWMG  RVTMTRDTSSTTAYMELNRLTSDDTAVYFCAR  WGQGTLVTVSS
MG1-9          EVQLVEAGGGLVQPGGSLRLACAASGFTF  WVRQAPGKGLEWIS  RFTISRDNAQNSLFLQMNSLRAEDTAVYYCAT  WGQGTMVTVSS
MG1-10         EVQLVQSGAEVKKPGESLKISCKGSGYSF  WVRQMPGRGLEWLG  QVTMSANRSISTAYLQWSSLKASDTGIYYCAT  WGQGTTVTVSS
MG5-1          QVQLVESGGGLIQPGESLRLSCEAFGFTV  WVRQAPGKGLEWVS  RFTISRDSTQNTVHLQMNSLTAEDTAVYYCAR  WGQGTLVTVSS
MG5-2          EVQLVQSGAELKKPGSSVKVSCTSSGGSF  WVRQAPGQGLEWMG  RLILSVDEPTRTVYMELTSLRSDDTAMYYCAR  WGQGTTVTVSS
MG5-4          EVQLLESGGGLVQPGRSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNAKDSLYLQMNSLRPEDTALYYCAR  WGQGTMVTVSS
MG5-5          EVQLLESGGGVVQPGRSLRLSCVASGFTF  WVRQAPGKGLEWVS  RFTISRDYSNKIVHLEMDSLRAEDTAVYFCVR  WGQGTLVTVSS
MG5-6          EVQLLESGGGLVKPGGSLRLSCAASGFTF  WVRQAPGKGLECVA  RFTISRDDSRDMLYLQMNNLKTEDTAVYYCSD  SSQGTLVTVSS
MG5-7          EVQLVESGGGLVQPGRSLRLSCTTSGFSF  WVRQAPGKGLEWVS  RFTISRDDSKSIVYLQMSSLQTEDTAVYYCSR  WGRGTLVTVSS
MG5-9          EVQLLESGGGLARPGGSLRLSCSASGFAF  WVRQAPGKGLEWVS  TISRDNAKNSVYLQMNSLRAEDSAVYFCAR    WGQGTLVTVSS
MG10-1         QVQLVESGGNVVQPGTSLRLSCAASGFTF  WVRQAPGKGLEWVA  RFTISRDNSRNTVFLQMTSLRAEDTAVYYCGR  WGQGILVTVSS
MG10-2         EVQLLESGGGLVQPGGSLRLTCVGYGFTF  WVRQAPGKGPEWVA  RFTISRDNAKDSLYLQMDSLRPEDTAVYYCAR  APQGTLVTVSS
MG10-4         EVQLLESGGGLVQPGGSLRLSCAASGFIL  WVRQAPGKGLVWVS  QFTISRDNAKNTLYLQMNSLRVEDTAVYYCAR  WGQGTMVTVSS
MG10-5         EVQLLESGGGVVHPGRSLRLSCAVSGFSL  WVRQAPDKGLEWVA  RFTVSRDISKNTVYLQMNSLRAEDTALYYCAR  WGQGTMVTVSS
MG10-6         EVQLLESGGGLVQPGGSRRLSCAASGFTF  WFRQGPGKGLEWVA  RFTISRDDSKNSLSLQMDSLRTEDTAVYYCVR  WGQGTVVTVSS
MG10-8         QVQLVESGGGVVQPGRSLRLSCVASGFAF  WVRQTPGRGLEWLA  RFTISRDNSNNTVYLEMNSLRPEDSAIYYCAK  WGLGTVVTVSS
MG10-10        QVQLVESGGVVVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNSKNSLYLQMNSLRTDETALYYCV   WGQGTLVTVSS
MG2            EVQLLESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGLEWVS  RFTISRDNAKNSLYLQMNSLRTDETAVYYCAR  WGQGTTVTVSS
MG5            EVQLLQSGGGWVKPGGSLRLSCAASGFIC  WVRQAPGKGLEWVG  RFTISIDESRNALFLHMNSLTTDDTAVYYCST  WGQGTLVTVSS
MG6            EVQLLESGGVVVQPGRSLRLSCAASGFTF  WVRQAPGKGLEWVA  RFTVSRDTSTNTLYLQMNSLRVEDTAVYYCAR  WGQGTLVTVSS
MG7            QMQLVQSEAEVKKPGASMKVSCKASGYTF  WVRQATGQGLEWMG  RVTMTRNTSISTAYMELSSLTSADTAVYYCAR  WGQGTLVTVSS
MG10           QVQLVQSGAEVKKPGESLKISCKGSGYSF  WVRQMPGKGLEWMG  QVTISADKSISTAFLQWNSLKASDTAMYYCAR  WGLGTLVTVSS


24. The VH domain antibody scaffold of claim 21, wherein the VH domain antibody scaffold is selected from scaffolds having amino acid sequences of FR1 to FR4 below.

Scaffold Name  FR1                            FR2             FR3                               FR4
MG8-21         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGNEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-12L        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRRAPGKGIEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-7I         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGPEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-9I         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRKAPGKGYEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-10I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGYEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-11I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGYEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-12I        EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGIEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-32         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGPEHVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-34         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRSAPGKGVEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-40         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGTEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-46         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRCAPGKGYEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-47         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGLEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-48         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGLEYVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-51         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGTEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-53         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGVEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-55         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRWAPGKGPEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-57         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGREWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-58         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGCELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-59         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRKAPGKGLETVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-60         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRNAPGKGLECVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG2-64         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRCAPGKGWEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-12         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGVELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-13         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGAEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-17         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGREWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-18         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRYAPGKGVEFVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-20         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRFAPGKGLEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-28         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGTERVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-2          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGMEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-32         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRAAPGKGPELVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-33         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGYEHVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-34         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGLECVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-5          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRVAPGKGPETVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-6          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRMAPGKGSEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG4-7          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRLAPGKGTEMVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-11         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGAEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-12         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRWAPGKGKEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-13         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGIEPVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-14         EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGPEWVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-4          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRQAPGKGPEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-5          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRTAPGKGIEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-6          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRIAPGKGVEIVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS
MG8-8          EVQLVESGGGLVQPGGSLRLSCAASGFTF  WVRAAPGKGLEVVS  RFTISRDNSKNTLYLQMNSLRAEDTAVYYCAS  WGQGTLVTVSS


25. A polynucleotide encoding the VH domain antibody scaffold of claim 21.
26. A VH domain antibody having the VH domain antibody scaffold of claim 21.
27. A polynucleotide encoding the VH domain antibody of claim 26.
28. A VH domain antibody library having human-derived random CDRH1, CDRH2 and CDRH3 inserted into the VH domain antibody scaffold of claim 21.
29. The VH domain antibody library of claim 28, wherein the inserted CDRH3 has 5 to 15 amino acid residues.
30. The VH domain antibody library of claim 28, wherein the inserted CDRH3 has 7 to 13 amino acid residues.
31. The VH domain antibody library of claim 28, wherein the human-derived random CDRH1, CDRH2 and CDRH3 have induced mutation therein.
32. The VH domain antibody library of claim 31, wherein the mutation is induced at one or more positions selected from position nos. 30 and 31 of CDRH1, position no. 53 of CDRH2, and position nos. 97, 99, 100, and 100a of CDRH3, based on Kabat numbering system.
33. The VH domain antibody library of claim 28, wherein the library is naïve, synthetic or immune library.
34. A method of screening a VH domain antibody having binding ability to a desired antigen by using the library of claim 28.
35. The method of claim 33, wherein the screening of the VH domain antibody is performed by eluting VH domain antibodies that are not bound to fixed desired antigens except VH domain antibodies that are bound thereto.
36. The method of claim 34, wherein the eluting of the VH domain antibodies that are not bound to fixed desired antigens is repeated twice or more.
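
As an informal reading aid for the consensus formulas recited in claim 21, the FR4 formula X₅₈GX₅₉GX₆₀X₆₁VTVSS can be written as a regular expression. The Python sketch below is only an illustration under that assumption; the names `FR4_PATTERN` and `matches_fr4` are hypothetical, and the second test sequence is a deliberately altered, made-up example.

```python
import re

# FR4 consensus of claim 21: X58 G X59 G X60 X61 V T V S S, where
#   X58 is W, C, Y, G, S, or A;  X59 is Q, R, or L;
#   X60 is A, T, I, or V;        X61 is L, M, P, V, or T.
FR4_PATTERN = re.compile(r"[WCYGSA]G[QRL]G[ATIV][LMPVT]VTVSS")


def matches_fr4(seq: str) -> bool:
    """Return True if the whole sequence fits the FR4 consensus above."""
    return FR4_PATTERN.fullmatch(seq) is not None


# FR4 sequence that appears among the scaffolds listed in claim 23.
print(matches_fr4("WGQGTLVTVSS"))
# Made-up sequence whose last two residues were altered for illustration.
print(matches_fr4("WGQGTLVTVAA"))
```

The same approach extends to FR1 to FR3 by writing one character class per variable position, although the legal scope of the claims is defined by the claim language and not by any such pattern.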